This analysis looks at the count of gestures as a function of whether the concomitant word is iconic or not.
There are three main variables:
- gesture yes/no, which includes all types of gestures (iconic, deictic, beat, metaphoric)
- iconic_gesture yes/no, which looks at whether those videos with a gesture also feature an iconic gesture
- non_iconic_gesture, which is gesture minus iconic gesture and is constructed in the script
The models will not be computed in the markdown due to the long estimation time. This is why the code chunks have eval = FALSE as a setting. Instead, the pre-compiled models are loaded into R from the models folder.
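As a sketch of what that loading step could look like (the file name gesture_mdl.rds below is a hypothetical example, not taken from this script):

```r
# Load a pre-compiled brms model from the models folder
# (the file name here is a hypothetical example):
gesture_mdl <- readRDS('../models/gesture_mdl.rds')

# Alternatively, brm() can cache its own fit via the `file` argument,
# which skips re-estimation whenever the .rds file already exists:
# gesture_mdl <- brm(gesture ~ type + (1 | word),
#                    data = df, family = bernoulli,
#                    file = '../models/gesture_mdl')
```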
Let’s load packages:
# Load packages:
library(tidyverse) # for data processing and visualization
library(patchwork) # for multi-plot arrays
library(brms) # for Bayesian analysis
library(effsize) # for Cohen's d effect size
library(dplyr) # for data manipulation (already attached via tidyverse)
For reproducibility, report version numbers:
R.Version()$version.string
## [1] "R version 4.4.0 (2024-04-24)"
packageVersion('tidyverse')
## [1] '2.0.0'
packageVersion('patchwork')
## [1] '1.2.0'
packageVersion('brms')
## [1] '2.21.0'
Load data:
df <- read_csv('../data/ALL_videos_coded_18_06_24.csv')
Let’s get rid of every row without a value for word_used, which are the ones that have not been coded.
df <- filter(df, !is.na(word_used))
Let’s make the word column lowercase:
df <- mutate(df, word = str_to_lower(word))
Let’s see for how many videos the word wasn’t actually used:
df |>
count(word_used) |>
mutate(proportion = n / sum(n))
## # A tibble: 2 × 3
## word_used n proportion
## <dbl> <int> <dbl>
## 1 0 1921 0.336
## 2 1 3804 0.664
Let’s reduce the data frame to only those cases where
word_used is equal to 1.
df <- filter(df, word_used == 1)
Let’s do a count of the duplicates:
df |>
count(not_duplicate) |>
mutate(proportion = n / sum(n))
## # A tibble: 2 × 3
## not_duplicate n proportion
## <dbl> <int> <dbl>
## 1 0 682 0.179
## 2 1 3122 0.821
Let’s get rid of the duplicates:
df <- filter(df, not_duplicate == 1)
Let’s count whether the speaker was visible:
df |>
count(speaker_visible) |>
mutate(proportion = n / sum(n))
## # A tibble: 2 × 3
## speaker_visible n proportion
## <dbl> <int> <dbl>
## 1 0 899 0.288
## 2 1 2223 0.712
Let’s take only those for which the speaker is visible:
df <- filter(df, speaker_visible == 1)
Let’s count whether the hands were visible:
df |>
count(hands_visible) |>
mutate(proportion = n / sum(n))
## # A tibble: 2 × 3
## hands_visible n proportion
## <dbl> <int> <dbl>
## 1 0 601 0.270
## 2 1 1622 0.730
Let’s take only those for which the hands are visible:
df <- filter(df, hands_visible == 1)
Let’s count whether the hands were free:
df |>
count(hands_free) |>
mutate(proportion = n / sum(n))
## # A tibble: 2 × 3
## hands_free n proportion
## <dbl> <int> <dbl>
## 1 0 76 0.0469
## 2 1 1546 0.953
The 0’s here include only those cases for which it was
clearly impossible for the speaker to gesture. So, we’ll keep only the cases where the hands were free to gesture.
df <- filter(df, hands_free == 1)
Not 100% sure about this, but for now, let’s code
iconic_gesture as 0 for the cases that have 0 in the
gesture column, just so those videos without any gesture
are handled consistently rather than left as NA.
df <- mutate(df,
iconic_gesture = ifelse(gesture == 0, 0, iconic_gesture))
For plotting later, it makes sense to switch the
non_iconic label to a less techy-looking
non-iconic, and also change the order so that the
non-iconic level then comes first:
df <- mutate(df,
type = ifelse(type == 'non_iconic',
'non-iconic \nword', 'iconic \nword'),
type = factor(type,
levels = c('non-iconic \nword', 'iconic \nword')))
Let’s check the count of words by itself:
# Save the word counts:
word_counts <- df |>
count(type, word, sort = TRUE)
# Show:
word_counts |>
print(n = Inf)
## # A tibble: 45 × 3
## type word n
## <fct> <chr> <int>
## 1 "non-iconic \nword" said 303
## 2 "iconic \nword" spank 97
## 3 "iconic \nword" slushy 67
## 4 "non-iconic \nword" realize 58
## 5 "non-iconic \nword" inform 57
## 6 "non-iconic \nword" wearing 56
## 7 "iconic \nword" yucky 56
## 8 "non-iconic \nword" knew 54
## 9 "non-iconic \nword" filling 51
## 10 "non-iconic \nword" other 50
## 11 "non-iconic \nword" exact 46
## 12 "non-iconic \nword" grateful 43
## 13 "non-iconic \nword" prevail 40
## 14 "iconic \nword" squish 37
## 15 "non-iconic \nword" tamper 36
## 16 "iconic \nword" splotch 36
## 17 "non-iconic \nword" put 35
## 18 "non-iconic \nword" confirmed 34
## 19 "non-iconic \nword" rejoin 34
## 20 "non-iconic \nword" discern 32
## 21 "non-iconic \nword" jealous 30
## 22 "iconic \nword" puffy 24
## 23 "non-iconic \nword" covet 22
## 24 "non-iconic \nword" ordain 21
## 25 "iconic \nword" swish 21
## 26 "iconic \nword" wring 21
## 27 "iconic \nword" saggy 17
## 28 "iconic \nword" swoosh 15
## 29 "iconic \nword" zap 15
## 30 "non-iconic \nword" outwit 14
## 31 "iconic \nword" chomp 14
## 32 "iconic \nword" crispy 14
## 33 "non-iconic \nword" absent 13
## 34 "iconic \nword" wheeze 13
## 35 "iconic \nword" woof 13
## 36 "non-iconic \nword" sullen 11
## 37 "iconic \nword" barking 11
## 38 "non-iconic \nword" acquaint 8
## 39 "iconic \nword" bang 6
## 40 "iconic \nword" munch 5
## 41 "iconic \nword" plump 5
## 42 "iconic \nword" wobbly 5
## 43 "non-iconic \nword" barren 3
## 44 "iconic \nword" snap 2
## 45 "iconic \nword" gooey 1
Create a non_iconic_gesture variable, which is
gesture minus iconic_gesture. This variable
specifically picks out the cases of gestures that are not
iconic.
# Seed with NA values:
df$non_iconic_gesture <- NA
# Create variable:
df <- mutate(df,
non_iconic_gesture = case_when(gesture == 1 & iconic_gesture == 0 ~ 'gesture (non-iconic)',
gesture == 0 & iconic_gesture == 0 ~ 'no gesture'))
# Double check:
df |>
distinct(gesture, iconic_gesture, non_iconic_gesture) |>
select(gesture, iconic_gesture, non_iconic_gesture)
## # A tibble: 4 × 3
## gesture iconic_gesture non_iconic_gesture
## <dbl> <dbl> <chr>
## 1 1 0 gesture (non-iconic)
## 2 0 0 no gesture
## 3 1 1 <NA>
## 4 1 NA <NA>
That’s correct.
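As an optional extra sanity check (purely a double-check, adding no new information beyond the distinct() call above), the same mapping can be cross-tabulated directly:

```r
# Cross-tabulate gesture against the derived variable;
# useNA = 'ifany' keeps the NA cells visible:
with(df, table(gesture, non_iconic_gesture, useNA = 'ifany'))
```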
Let’s count the overall average gesture rate:
# Save:
gesture_counts <- df |>
count(gesture) |>
mutate(proportion = n / sum(n))
# Show:
gesture_counts
## # A tibble: 2 × 3
## gesture n proportion
## <dbl> <int> <dbl>
## 1 0 664 0.429
## 2 1 882 0.571
Let’s count iconic gestures:
# Save:
iconic_counts <- df |>
filter(gesture == 1) |>
count(iconic_gesture) |>
mutate(proportion = n / sum(n))
# Show:
iconic_counts
## # A tibble: 3 × 3
## iconic_gesture n proportion
## <dbl> <int> <dbl>
## 1 0 671 0.761
## 2 1 210 0.238
## 3 NA 1 0.00113
Let’s count the overall gestures by word type:
# Save:
all_gesture_by_type <- df |>
count(type, gesture) |>
group_by(type) |>
mutate(proportion = n / sum(n))
# Show:
all_gesture_by_type
## # A tibble: 4 × 4
## # Groups: type [2]
## type gesture n proportion
## <fct> <dbl> <int> <dbl>
## 1 "non-iconic \nword" 0 514 0.489
## 2 "non-iconic \nword" 1 537 0.511
## 3 "iconic \nword" 0 150 0.303
## 4 "iconic \nword" 1 345 0.697
Let’s count the iconic gestures by word type, over all gestures:
# Save:
iconic_by_type_over_gestures <- df |>
filter(gesture == 1) |>
count(type, iconic_gesture) |>
group_by(type) |>
mutate(proportion = n / sum(n))
# Show:
iconic_by_type_over_gestures
## # A tibble: 5 × 4
## # Groups: type [2]
## type iconic_gesture n proportion
## <fct> <dbl> <int> <dbl>
## 1 "non-iconic \nword" 0 435 0.810
## 2 "non-iconic \nword" 1 101 0.188
## 3 "non-iconic \nword" NA 1 0.00186
## 4 "iconic \nword" 0 236 0.684
## 5 "iconic \nword" 1 109 0.316
Let’s do the same again, but this time only for those that are not iconic gestures.
# Save:
other_by_type_over_gestures <- df |>
filter(gesture == 1) |>
count(type, non_iconic_gesture) |>
group_by(type) |>
mutate(proportion = n / sum(n))
# Show:
other_by_type_over_gestures
## # A tibble: 4 × 4
## # Groups: type [2]
## type non_iconic_gesture n proportion
## <fct> <chr> <int> <dbl>
## 1 "non-iconic \nword" gesture (non-iconic) 435 0.810
## 2 "non-iconic \nword" <NA> 102 0.190
## 3 "iconic \nword" gesture (non-iconic) 236 0.684
## 4 "iconic \nword" <NA> 109 0.316
Let’s count the iconic gestures by word type, over all eligible tokens:
# Save:
# df is by now restricted to eligible (hands-free) tokens, so the
# per-type denominator is simply the number of rows of each type:
iconic_by_type_over_eligible_tokens <- df |>
  filter(gesture == 1) |>
  count(type, iconic_gesture) |>
  group_by(type) |>
  mutate(proportion = n / sum(df$type == cur_group()$type))
# Show:
iconic_by_type_over_eligible_tokens
## # A tibble: 5 × 4
## # Groups: type [2]
## type iconic_gesture n proportion
## <fct> <dbl> <int> <dbl>
## 1 "non-iconic \nword" 0 435 0.414
## 2 "non-iconic \nword" 1 101 0.0961
## 3 "non-iconic \nword" NA 1 0.000951
## 4 "iconic \nword" 0 236 0.477
## 5 "iconic \nword" 1 109 0.220
Let’s do the same again, but this time only for those that are not iconic gestures.
# Save:
# Same denominator as above: all remaining rows of df are eligible tokens:
other_by_type_over_eligible_tokens <- df |>
  filter(gesture == 1) |>
  count(type, non_iconic_gesture) |>
  group_by(type) |>
  mutate(proportion = n / sum(df$type == cur_group()$type))
# Show:
other_by_type_over_eligible_tokens
## # A tibble: 4 × 4
## # Groups: type [2]
## type non_iconic_gesture n proportion
## <fct> <chr> <int> <dbl>
## 1 "non-iconic \nword" gesture (non-iconic) 435 0.414
## 2 "non-iconic \nword" <NA> 102 0.0971
## 3 "iconic \nword" gesture (non-iconic) 236 0.477
## 4 "iconic \nword" <NA> 109 0.220
Let’s do both of these things on a word-by-word basis. Gesture counts first:
# Save counts:
all_gesture_by_word <- df |>
count(word, gesture) |>
group_by(word) |>
mutate(proportion = n / sum(n)) |>
filter(gesture == 1) |>
select(-gesture)
# Show:
all_gesture_by_word |>
arrange(desc(proportion)) |>
print(n = Inf)
## # A tibble: 44 × 3
## # Groups: word [44]
## word n proportion
## <chr> <int> <dbl>
## 1 bang 6 1
## 2 snap 2 1
## 3 splotch 34 0.944
## 4 chomp 13 0.929
## 5 squish 33 0.892
## 6 zap 13 0.867
## 7 exact 38 0.826
## 8 wring 17 0.810
## 9 plump 4 0.8
## 10 wobbly 4 0.8
## 11 filling 40 0.784
## 12 discern 25 0.781
## 13 put 27 0.771
## 14 ordain 16 0.762
## 15 slushy 49 0.731
## 16 crispy 10 0.714
## 17 swish 15 0.714
## 18 wearing 40 0.714
## 19 spank 67 0.691
## 20 barren 2 0.667
## 21 swoosh 10 0.667
## 22 other 33 0.66
## 23 tamper 23 0.639
## 24 barking 7 0.636
## 25 covet 14 0.636
## 26 acquaint 5 0.625
## 27 woof 8 0.615
## 28 realize 35 0.603
## 29 munch 3 0.6
## 30 knew 32 0.593
## 31 outwit 8 0.571
## 32 puffy 13 0.542
## 33 rejoin 17 0.5
## 34 confirmed 16 0.471
## 35 jealous 14 0.467
## 36 yucky 26 0.464
## 37 sullen 5 0.455
## 38 wheeze 5 0.385
## 39 said 111 0.366
## 40 saggy 6 0.353
## 41 prevail 14 0.35
## 42 grateful 12 0.279
## 43 inform 9 0.158
## 44 absent 1 0.0769
Proportion of iconic gestures (out of videos with gesture) by word:
# Save counts:
iconic_by_word_over_gestures <- df |>
count(word, iconic_gesture) |>
group_by(word) |>
mutate(proportion = n / sum(n)) |>
filter(iconic_gesture == 1) |>
select(-iconic_gesture, -n)
# Show:
iconic_by_word_over_gestures |>
arrange(desc(proportion)) |>
print(n = Inf)
## # A tibble: 35 × 2
## # Groups: word [35]
## word proportion
## <chr> <dbl>
## 1 bang 0.833
## 2 squish 0.811
## 3 wobbly 0.6
## 4 wring 0.524
## 5 put 0.429
## 6 woof 0.385
## 7 chomp 0.357
## 8 barren 0.333
## 9 zap 0.333
## 10 swish 0.286
## 11 splotch 0.278
## 12 tamper 0.222
## 13 discern 0.219
## 14 exact 0.217
## 15 munch 0.2
## 16 other 0.18
## 17 filling 0.137
## 18 swoosh 0.133
## 19 spank 0.124
## 20 slushy 0.119
## 21 inform 0.105
## 22 jealous 0.1
## 23 covet 0.0909
## 24 puffy 0.0833
## 25 wheeze 0.0769
## 26 knew 0.0741
## 27 crispy 0.0714
## 28 outwit 0.0714
## 29 realize 0.0690
## 30 confirmed 0.0588
## 31 rejoin 0.0588
## 32 wearing 0.0536
## 33 said 0.0495
## 34 grateful 0.0465
## 35 yucky 0.0357
Proportion of other gestures (out of videos with gesture) by word. Note that the code below keeps both levels of non_iconic_gesture, so each word appears twice: once with its non-iconic proportion and once (the NA level) with the complementary iconic proportion:
# Save counts:
other_by_word_over_gestures <- df |>
filter(gesture == 1) |>
count(word, non_iconic_gesture) |>
group_by(word) |>
mutate(proportion = n / sum(n)) |>
select(-non_iconic_gesture, -n)
# Show:
other_by_word_over_gestures |>
arrange(desc(proportion)) |>
print(n = Inf)
## # A tibble: 79 × 2
## # Groups: word [44]
## word proportion
## <chr> <dbl>
## 1 absent 1
## 2 acquaint 1
## 3 barking 1
## 4 ordain 1
## 5 plump 1
## 6 prevail 1
## 7 saggy 1
## 8 snap 1
## 9 sullen 1
## 10 wearing 0.925
## 11 yucky 0.923
## 12 squish 0.909
## 13 crispy 0.9
## 14 rejoin 0.882
## 15 confirmed 0.875
## 16 knew 0.875
## 17 outwit 0.875
## 18 said 0.865
## 19 covet 0.857
## 20 realize 0.857
## 21 puffy 0.846
## 22 slushy 0.837
## 23 bang 0.833
## 24 grateful 0.833
## 25 filling 0.825
## 26 spank 0.821
## 27 swoosh 0.8
## 28 wheeze 0.8
## 29 jealous 0.786
## 30 wobbly 0.75
## 31 exact 0.737
## 32 other 0.727
## 33 discern 0.72
## 34 splotch 0.706
## 35 inform 0.667
## 36 munch 0.667
## 37 tamper 0.652
## 38 wring 0.647
## 39 woof 0.625
## 40 chomp 0.615
## 41 zap 0.615
## 42 swish 0.6
## 43 put 0.556
## 44 barren 0.5
## 45 barren 0.5
## 46 put 0.444
## 47 swish 0.4
## 48 chomp 0.385
## 49 zap 0.385
## 50 woof 0.375
## 51 wring 0.353
## 52 tamper 0.348
## 53 inform 0.333
## 54 munch 0.333
## 55 splotch 0.294
## 56 discern 0.28
## 57 other 0.273
## 58 exact 0.263
## 59 wobbly 0.25
## 60 jealous 0.214
## 61 swoosh 0.2
## 62 wheeze 0.2
## 63 spank 0.179
## 64 filling 0.175
## 65 bang 0.167
## 66 grateful 0.167
## 67 slushy 0.163
## 68 puffy 0.154
## 69 covet 0.143
## 70 realize 0.143
## 71 said 0.135
## 72 confirmed 0.125
## 73 knew 0.125
## 74 outwit 0.125
## 75 rejoin 0.118
## 76 crispy 0.1
## 77 squish 0.0909
## 78 yucky 0.0769
## 79 wearing 0.075
Proportion of iconic gestures (out of eligible tokens) by word:
# Save counts:
iconic_by_word_over_eligible_tokens <- df |>
  filter(iconic_gesture == 1) |>
  count(word, iconic_gesture) |>
  group_by(word) |>
  # All rows of df are eligible tokens, so the denominator is the
  # per-word row count:
  mutate(proportion = n / sum(df$word == cur_group()$word)) |>
  select(-iconic_gesture)
# Show:
iconic_by_word_over_eligible_tokens |>
arrange(desc(proportion)) |>
print(n = Inf)
## # A tibble: 35 × 3
## # Groups: word [35]
## word n proportion
## <chr> <int> <dbl>
## 1 bang 5 0.833
## 2 squish 30 0.811
## 3 wobbly 3 0.6
## 4 wring 11 0.524
## 5 put 15 0.429
## 6 woof 5 0.385
## 7 chomp 5 0.357
## 8 barren 1 0.333
## 9 zap 5 0.333
## 10 swish 6 0.286
## 11 splotch 10 0.278
## 12 tamper 8 0.222
## 13 discern 7 0.219
## 14 exact 10 0.217
## 15 munch 1 0.2
## 16 other 9 0.18
## 17 filling 7 0.137
## 18 swoosh 2 0.133
## 19 spank 12 0.124
## 20 slushy 8 0.119
## 21 inform 6 0.105
## 22 jealous 3 0.1
## 23 covet 2 0.0909
## 24 puffy 2 0.0833
## 25 wheeze 1 0.0769
## 26 knew 4 0.0741
## 27 crispy 1 0.0714
## 28 outwit 1 0.0714
## 29 realize 4 0.0690
## 30 confirmed 2 0.0588
## 31 rejoin 2 0.0588
## 32 wearing 3 0.0536
## 33 said 15 0.0495
## 34 grateful 2 0.0465
## 35 yucky 2 0.0357
Proportion of other/non-iconic gestures (out of eligible tokens) by word:
# Save counts:
other_by_word_over_eligible_tokens <- df |>
  filter(gesture == 1) |>
  count(word, non_iconic_gesture) |>
  group_by(word) |>
  # Same per-word denominator as above:
  mutate(proportion = n / sum(df$word == cur_group()$word)) |>
  filter(!is.na(non_iconic_gesture)) |> # drop the iconic-gesture (NA) rows
  select(-non_iconic_gesture)
# Show:
other_by_word_over_eligible_tokens |>
arrange(desc(proportion)) |>
print(n = Inf)
## # A tibble: 44 × 3
## # Groups: word [44]
## word n proportion
## <chr> <int> <dbl>
## 1 snap 2 1
## 2 plump 4 0.8
## 3 ordain 16 0.762
## 4 splotch 24 0.667
## 5 wearing 37 0.661
## 6 filling 33 0.647
## 7 crispy 9 0.643
## 8 barking 7 0.636
## 9 acquaint 5 0.625
## 10 slushy 41 0.612
## 11 exact 28 0.609
## 12 chomp 8 0.571
## 13 spank 55 0.567
## 14 discern 18 0.562
## 15 covet 12 0.545
## 16 swoosh 8 0.533
## 17 zap 8 0.533
## 18 knew 28 0.519
## 19 realize 30 0.517
## 20 outwit 7 0.5
## 21 other 24 0.48
## 22 puffy 11 0.458
## 23 sullen 5 0.455
## 24 rejoin 15 0.441
## 25 swish 9 0.429
## 26 yucky 24 0.429
## 27 tamper 15 0.417
## 28 confirmed 14 0.412
## 29 munch 2 0.4
## 30 jealous 11 0.367
## 31 saggy 6 0.353
## 32 prevail 14 0.35
## 33 put 12 0.343
## 34 barren 1 0.333
## 35 said 96 0.317
## 36 wheeze 4 0.308
## 37 wring 6 0.286
## 38 grateful 10 0.233
## 39 woof 3 0.231
## 40 wobbly 1 0.2
## 41 bang 1 0.167
## 42 squish 3 0.0811
## 43 absent 1 0.0769
## 44 inform 3 0.0526
Let’s merge the counts and the proportions into a big table showing everything on a by-word basis:
# Save:
by_word_all <- word_counts |>
rename(no_of_eligible_tokens = n) |>
left_join(all_gesture_by_word, by = "word") |>
rename(gesture_rate = proportion) |>
rename(no_of_gesture = n) |>
left_join(iconic_by_word_over_eligible_tokens, by = "word") |>
rename(iconic_gesture_rate = proportion) |>
rename(no_of_iconic = n) |>
left_join(other_by_word_over_eligible_tokens, by = "word") |>
rename(other_gesture_rate = proportion) |>
rename(no_of_other = n) |>
mutate(type = str_replace(type, '\n', ''),
type = factor(type, levels = c('non-iconic word', 'iconic word')))
# Show:
by_word_all |>
print(n = Inf)
## # A tibble: 45 × 9
## type word no_of_eligible_tokens no_of_gesture gesture_rate no_of_iconic
## <fct> <chr> <int> <int> <dbl> <int>
## 1 non-icon… said 303 111 0.366 15
## 2 iconic w… spank 97 67 0.691 12
## 3 iconic w… slus… 67 49 0.731 8
## 4 non-icon… real… 58 35 0.603 4
## 5 non-icon… info… 57 9 0.158 6
## 6 non-icon… wear… 56 40 0.714 3
## 7 iconic w… yucky 56 26 0.464 2
## 8 non-icon… knew 54 32 0.593 4
## 9 non-icon… fill… 51 40 0.784 7
## 10 non-icon… other 50 33 0.66 9
## 11 non-icon… exact 46 38 0.826 10
## 12 non-icon… grat… 43 12 0.279 2
## 13 non-icon… prev… 40 14 0.35 NA
## 14 iconic w… squi… 37 33 0.892 30
## 15 non-icon… tamp… 36 23 0.639 8
## 16 iconic w… splo… 36 34 0.944 10
## 17 non-icon… put 35 27 0.771 15
## 18 non-icon… conf… 34 16 0.471 2
## 19 non-icon… rejo… 34 17 0.5 2
## 20 non-icon… disc… 32 25 0.781 7
## 21 non-icon… jeal… 30 14 0.467 3
## 22 iconic w… puffy 24 13 0.542 2
## 23 non-icon… covet 22 14 0.636 2
## 24 non-icon… orda… 21 16 0.762 NA
## 25 iconic w… swish 21 15 0.714 6
## 26 iconic w… wring 21 17 0.810 11
## 27 iconic w… saggy 17 6 0.353 NA
## 28 iconic w… swoo… 15 10 0.667 2
## 29 iconic w… zap 15 13 0.867 5
## 30 non-icon… outw… 14 8 0.571 1
## 31 iconic w… chomp 14 13 0.929 5
## 32 iconic w… cris… 14 10 0.714 1
## 33 non-icon… abse… 13 1 0.0769 NA
## 34 iconic w… whee… 13 5 0.385 1
## 35 iconic w… woof 13 8 0.615 5
## 36 non-icon… sull… 11 5 0.455 NA
## 37 iconic w… bark… 11 7 0.636 NA
## 38 non-icon… acqu… 8 5 0.625 NA
## 39 iconic w… bang 6 6 1 5
## 40 iconic w… munch 5 3 0.6 1
## 41 iconic w… plump 5 4 0.8 NA
## 42 iconic w… wobb… 5 4 0.8 3
## 43 non-icon… barr… 3 2 0.667 1
## 44 iconic w… snap 2 2 1 NA
## 45 iconic w… gooey 1 NA NA NA
## # ℹ 3 more variables: iconic_gesture_rate <dbl>, no_of_other <int>,
## # other_gesture_rate <dbl>
The NA’s in this table are true zeros, so they should be
replaced with 0.
by_word_all <- mutate(by_word_all,
no_of_gesture = ifelse(is.na(no_of_gesture),
0, no_of_gesture),
gesture_rate = ifelse(is.na(gesture_rate),
0, gesture_rate),
no_of_iconic = ifelse(is.na(no_of_iconic),
0, no_of_iconic),
iconic_gesture_rate = ifelse(is.na(iconic_gesture_rate),
0, iconic_gesture_rate),
no_of_other = ifelse(is.na(no_of_other),
0, no_of_other),
other_gesture_rate = ifelse(is.na(other_gesture_rate),
0, other_gesture_rate))
# Show again:
by_word_all |>
print(n = Inf)
## # A tibble: 45 × 9
## type word no_of_eligible_tokens no_of_gesture gesture_rate no_of_iconic
## <fct> <chr> <int> <dbl> <dbl> <dbl>
## 1 non-icon… said 303 111 0.366 15
## 2 iconic w… spank 97 67 0.691 12
## 3 iconic w… slus… 67 49 0.731 8
## 4 non-icon… real… 58 35 0.603 4
## 5 non-icon… info… 57 9 0.158 6
## 6 non-icon… wear… 56 40 0.714 3
## 7 iconic w… yucky 56 26 0.464 2
## 8 non-icon… knew 54 32 0.593 4
## 9 non-icon… fill… 51 40 0.784 7
## 10 non-icon… other 50 33 0.66 9
## 11 non-icon… exact 46 38 0.826 10
## 12 non-icon… grat… 43 12 0.279 2
## 13 non-icon… prev… 40 14 0.35 0
## 14 iconic w… squi… 37 33 0.892 30
## 15 non-icon… tamp… 36 23 0.639 8
## 16 iconic w… splo… 36 34 0.944 10
## 17 non-icon… put 35 27 0.771 15
## 18 non-icon… conf… 34 16 0.471 2
## 19 non-icon… rejo… 34 17 0.5 2
## 20 non-icon… disc… 32 25 0.781 7
## 21 non-icon… jeal… 30 14 0.467 3
## 22 iconic w… puffy 24 13 0.542 2
## 23 non-icon… covet 22 14 0.636 2
## 24 non-icon… orda… 21 16 0.762 0
## 25 iconic w… swish 21 15 0.714 6
## 26 iconic w… wring 21 17 0.810 11
## 27 iconic w… saggy 17 6 0.353 0
## 28 iconic w… swoo… 15 10 0.667 2
## 29 iconic w… zap 15 13 0.867 5
## 30 non-icon… outw… 14 8 0.571 1
## 31 iconic w… chomp 14 13 0.929 5
## 32 iconic w… cris… 14 10 0.714 1
## 33 non-icon… abse… 13 1 0.0769 0
## 34 iconic w… whee… 13 5 0.385 1
## 35 iconic w… woof 13 8 0.615 5
## 36 non-icon… sull… 11 5 0.455 0
## 37 iconic w… bark… 11 7 0.636 0
## 38 non-icon… acqu… 8 5 0.625 0
## 39 iconic w… bang 6 6 1 5
## 40 iconic w… munch 5 3 0.6 1
## 41 iconic w… plump 5 4 0.8 0
## 42 iconic w… wobb… 5 4 0.8 3
## 43 non-icon… barr… 3 2 0.667 1
## 44 iconic w… snap 2 2 1 0
## 45 iconic w… gooey 1 0 0 0
## # ℹ 3 more variables: iconic_gesture_rate <dbl>, no_of_other <dbl>,
## # other_gesture_rate <dbl>
# Save outside of R:
by_word_all |>
write_csv('../data/by_word_gesture_rates.csv')
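As an aside, the chain of ifelse() calls above could be written more compactly with dplyr’s across() and tidyr’s replace_na() (an equivalent alternative, sketched here; note that ifelse() coerces the integer count columns to double, which replace_na() would avoid):

```r
# Equivalent, more compact zero-filling of the count and rate columns:
by_word_all <- by_word_all |>
  mutate(across(c(no_of_gesture, gesture_rate,
                  no_of_iconic, iconic_gesture_rate,
                  no_of_other, other_gesture_rate),
                \(x) replace_na(x, 0)))
```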
Now let’s average the by-word gesture rates within each word type (using ‘by_word_all’).
First, overall gesture rate:
# Group by the 'type' column and calculate the average gesture_rate for each group
average_gesture_rate <- by_word_all %>%
group_by(type) %>%
summarise(avg_gesture_rate = mean(gesture_rate, na.rm = TRUE))
# View the result
print(average_gesture_rate)
## # A tibble: 2 × 2
## type avg_gesture_rate
## <fct> <dbl>
## 1 non-iconic word 0.555
## 2 iconic word 0.689
Then, iconic gesture rate:
# Group by the 'type' column and calculate the average iconic_gesture_rate for each group
average_iconic_gesture_rate <- by_word_all %>%
group_by(type) %>%
summarise(avg_gesture_rate = mean(iconic_gesture_rate, na.rm = TRUE))
# View the result
print(average_iconic_gesture_rate)
## # A tibble: 2 × 2
## type avg_gesture_rate
## <fct> <dbl>
## 1 non-iconic word 0.109
## 2 iconic word 0.239
And finally, other gesture rate:
# Group by the 'type' column and calculate the average other_gesture_rate for each group
average_other_gesture_rate <- by_word_all %>%
group_by(type) %>%
summarise(avg_gesture_rate = mean(other_gesture_rate, na.rm = TRUE))
# View the result
print(average_other_gesture_rate)
## # A tibble: 2 × 2
## type avg_gesture_rate
## <fct> <dbl>
## 1 non-iconic word 0.444
## 2 iconic word 0.450
We have also coded the videos with gesture for their context. Let’s see what proportion of gestures comes from each context.
First, unscripted interviews:
# Count the number of gestures (gesture == 1)
total_gestures <- sum(df$gesture == 1)
# Count the number of gestures from unscripted interviews
unscripted_interview_count <- sum(df$gesture == 1 & df$`unscripted interview` == 1)
# Print the result
print(unscripted_interview_count)
## [1] 307
# Calculate the proportion
prop_unscripted_interview <- unscripted_interview_count / total_gestures
# Print the result
print(prop_unscripted_interview)
## [1] 0.3480726
Now, presenting to camera:
# Count the number of gestures from presenting to camera
presenting_camera_count <- sum(df$gesture == 1 & df$`presenting camera` == 1)
# Print the result
print(presenting_camera_count)
## [1] 134
# Calculate the proportion
prop_presenting_camera <- presenting_camera_count / total_gestures
# Print the result
print(prop_presenting_camera)
## [1] 0.1519274
Now, presenting in front of a screen:
# Count the number of gestures from presenting in front of screen
presenting_screen_count <- sum(df$gesture == 1 & df$`presenting screen` == 1)
# Print the result
print(presenting_screen_count)
## [1] 131
# Calculate the proportion
prop_presenting_screen <- presenting_screen_count / total_gestures
# Print the result
print(prop_presenting_screen)
## [1] 0.1485261
Now, giving a speech:
# Count the number of gestures from speeches
giving_speech_count <- sum(df$gesture == 1 & df$`giving speech` == 1)
# Print the result
print(giving_speech_count)
## [1] 195
# Calculate the proportion
prop_giving_speech <- giving_speech_count / total_gestures
# Print the result
print(prop_giving_speech)
## [1] 0.2210884
Speaking in court:
# Count the number of gestures from court
speaking_court_count <- sum(df$gesture == 1 & df$`speaking in court` == 1)
# Print the result
print(speaking_court_count)
## [1] 63
# Calculate the proportion
prop_speaking_court <- speaking_court_count / total_gestures
# Print the result
print(prop_speaking_court)
## [1] 0.07142857
Now, semi-scripted:
# Count the number of gestures from semi-scripted contexts
semi_scripted_count <- sum(df$gesture == 1 & df$`semi-scripted` == 1)
# Print the result
print(semi_scripted_count)
## [1] 33
# Calculate the proportion
prop_semi_scripted <- semi_scripted_count / total_gestures
# Print the result
print(prop_semi_scripted)
## [1] 0.03741497
Finally, scripted acting:
# Count the number of gestures from scripted acting
scripted_acting_count <- sum(df$gesture == 1 & df$`scripted acting` == 1)
# Print the result
print(scripted_acting_count)
## [1] 19
# Calculate the proportion
prop_scripted_acting <- scripted_acting_count / total_gestures
# Print the result
print(prop_scripted_acting)
## [1] 0.02154195
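The seven blocks above repeat the same computation; the counts and proportions could also be derived in a single pass (a sketch; the context column names below are assumed to match the data file exactly):

```r
# Context columns as named in the data (assumed to match the file):
context_cols <- c('unscripted interview', 'presenting camera',
                  'presenting screen', 'giving speech',
                  'speaking in court', 'semi-scripted', 'scripted acting')

# Count gestures per context and express each as a proportion
# of all gesture tokens:
df |>
  filter(gesture == 1) |>
  summarise(across(all_of(context_cols), \(x) sum(x == 1, na.rm = TRUE))) |>
  pivot_longer(everything(), names_to = 'context', values_to = 'n') |>
  mutate(proportion = n / sum(df$gesture == 1, na.rm = TRUE))
```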
Let’s make a bar plot of the gesture counts:
# Basic plot:
gesture_p <- all_gesture_by_type |>
mutate(gesture = ifelse(gesture == 1, 'gesture', 'no gesture')) |>
mutate(gesture = factor(gesture,
levels = c('no gesture', 'gesture'))) |>
ggplot(aes(x = type,
y = proportion,
fill = gesture)) +
geom_col(width = 0.55)
# Axes and labels:
gesture_p <- gesture_p +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '') +
scale_y_continuous(expand = c(0, 0)) +
xlab(NULL) +
ylab('All gesture proportion')
# Look and feel:
gesture_p <- gesture_p +
theme_classic()
# Show:
gesture_p
Same for iconic gestures only:
# Basic plot:
iconic_p <- iconic_by_type_over_eligible_tokens |>
mutate(iconic_gesture = ifelse(iconic_gesture == 1,
'iconic gesture',
'other gesture')) |>
mutate(iconic_gesture = factor(iconic_gesture,
levels = c('other gesture', 'iconic gesture'))) |>
ggplot(aes(x = type,
y = proportion,
fill = iconic_gesture)) +
geom_col(width = 0.55)
# Axes and labels:
iconic_p <- iconic_p +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
labels = c('no gesture', 'gesture'),
name = '') +
scale_y_continuous(expand = c(0, 0)) +
xlab(NULL) +
ylab('Iconic gesture proportion')
# Look and feel:
iconic_p <- iconic_p +
theme_classic()
# Show:
iconic_p
Same for other types of gestures only. This will be the rightmost
plot in our multiple-plot array. For this reason we relabel the fill
levels so that this plot’s legend (‘no gesture’ vs. ‘gesture’) can
serve as the shared legend for all three plots (gesture vs. no
gesture, iconic vs. other, etc.).
# Basic plot:
other_p <- other_by_type_over_eligible_tokens |>
mutate(non_iconic_gesture = ifelse(is.na(non_iconic_gesture), 'NA', non_iconic_gesture),
non_iconic_gesture = factor(non_iconic_gesture, levels = c('NA', 'gesture (non-iconic)'))) |>
ggplot(aes(x = type, y = proportion, fill = non_iconic_gesture)) +
geom_col(width = 0.55)
# Axes and labels:
other_p <- other_p +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
labels = c('no gesture', 'gesture'),
name = '') +
scale_y_continuous(expand = c(0, 0)) +
xlab(NULL) +
ylab('Other gesture proportion')
# Look and feel:
other_p <- other_p +
theme_classic()
# Show:
other_p
Put them all into a three-column plot array, using the
patchwork library. We then also add titles and switch off
the legends for all plots except the last one:
# Add titles and switch off legends except for the last one:
gesture_p <- gesture_p +
ggtitle('a) All gestures') +
theme(legend.position = 'none',
plot.caption = element_text(hjust = 0))
iconic_p <- iconic_p +
ggtitle('b) Iconic gestures') +
theme(legend.position = 'none',
plot.caption = element_text(hjust = 0)) +
ylab(NULL)
other_p <- other_p +
ggtitle('c) Other gestures') +
ylab(NULL) +
theme(plot.caption = element_text(hjust = 0))
# Combine:
three_p <- gesture_p + iconic_p + other_p +
plot_layout(ncol = 3)
# Save combined plot outside:
ggsave(plot = three_p, filename = '../figures/barplots.pdf',
width = 7, height = 3.5)
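Instead of switching the legends off by hand, patchwork could also collect the legends into a single shared guide (an alternative sketch; it assumes the plots still carry their legends, i.e., that legend.position was not already set to 'none' above):

```r
# Collect one shared legend for the whole array instead of
# disabling legends plot by plot:
three_p_alt <- (gesture_p + iconic_p + other_p) +
  plot_layout(ncol = 3, guides = 'collect') &
  theme(legend.position = 'bottom')
```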
Let’s make a density plot of the by-word
gesture_rate:
# Plot basics:
ges_prop_p <- by_word_all |>
ggplot(aes(x = gesture_rate, fill = type)) +
geom_density(alpha = 0.5)
# Axes and labels:
ges_prop_p <- ges_prop_p +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '') +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 3),
breaks = seq(0, 3, 0.5)) +
scale_x_continuous(expand = c(0, 0),
limits = c(0, 1),
breaks = seq(0, 1, 0.25)) +
xlab('Gesture proportion of each word') +
ylab('Density')
# Themes:
ges_prop_p <- ges_prop_p +
theme_classic() +
theme(legend.position = 'bottom') +
ggtitle('a) Proportion of all gestures\nacross words')
# Show:
ges_prop_p
Now the same density plot for
iconic_gesture_rate:
# Plot basics:
icon_prop_p <- by_word_all |>
ggplot(aes(x = iconic_gesture_rate, fill = type)) +
geom_density(alpha = 0.5)
# Axes and labels:
icon_prop_p <- icon_prop_p +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '') +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 3),
breaks = seq(0, 3, 0.5)) +
scale_x_continuous(expand = c(0, 0),
limits = c(0, 1),
breaks = seq(0, 1, 0.25)) +
xlab('Iconic gesture proportion of each word') +
ylab('Density')
# Themes:
icon_prop_p <- icon_prop_p +
theme_classic() +
theme(legend.position = 'bottom') +
ggtitle('b) Proportion of iconic gestures\nacross words')
# Show:
icon_prop_p
Proportion of other gestures:
# Plot basics:
other_prop_p <- by_word_all |>
ggplot(aes(x = other_gesture_rate, fill = type)) +
geom_density(alpha = 0.5)
# Axes and labels:
other_prop_p <- other_prop_p +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '') +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 3),
breaks = seq(0, 3, 0.5)) +
scale_x_continuous(expand = c(0, 0),
limits = c(0, 1),
breaks = seq(0, 1, 0.25)) +
xlab('Other gesture proportion of each word') +
ylab('Density')
# Themes:
other_prop_p <- other_prop_p +
theme_classic() +
theme(legend.position = 'bottom') +
ggtitle('c) Proportion of any other gestures\nacross words')
# Show:
other_prop_p
Put all three density plots into one big plot:
# Add titles and switch off legends except for the last one:
ges_prop_p <- ges_prop_p +
theme(legend.position = 'none',
plot.caption = element_text(hjust = 0))
icon_prop_p <- icon_prop_p +
theme(legend.position = 'none',
plot.caption = element_text(hjust = 0)) +
ylab(NULL)
other_prop_p <- other_prop_p +
ylab(NULL) +
theme(plot.caption = element_text(hjust = 0))
# Combine:
three_prop_p <- ges_prop_p + icon_prop_p + other_prop_p +
plot_layout(ncol = 3)
# Save combined plot outside:
ggsave(plot = three_prop_p, filename = '../figures/density_plots.pdf',
width = 12, height = 3.5)
Let’s see whether we can do a bar plot of all the words for
gesture_rate:
# Plot basics:
word_bars_p <- by_word_all |>
ggplot(aes(x = reorder(word, gesture_rate),
y = gesture_rate, fill = type)) +
geom_col(width = 0.75) +
geom_text(aes(label = no_of_eligible_tokens),
nudge_y = +0.025,
size = 2.7)
# Axes and labels:
word_bars_p <- word_bars_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.1)) +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '',
labels = c('low iconicity word', 'high iconicity word')) +
ylab('Proportion of tokens with gesture') +
xlab(NULL)
# Look and feel:
word_bars_p <- word_bars_p +
theme_classic() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = 'bottom')
# Show:
word_bars_p
# Save:
ggsave(plot = word_bars_p, filename = '../figures/by_word_bars.pdf',
width = 10, height = 4)
Same for iconic_gesture_rate:
# Plot basics:
iconic_bars_p <- by_word_all |>
ggplot(aes(x = reorder(word, iconic_gesture_rate),
y = iconic_gesture_rate, fill = type)) +
geom_col(width = 0.75) +
geom_text(aes(label = no_of_eligible_tokens),
nudge_y = +0.025,
size = 2.7)
# Axes and labels:
iconic_bars_p <- iconic_bars_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.1)) +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '',
labels = c('low iconicity word', 'high iconicity word')) +
ylab('Proportion of tokens with iconic gesture') +
xlab(NULL)
# Look and feel:
iconic_bars_p <- iconic_bars_p +
theme_classic() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = 'bottom')
# Show:
iconic_bars_p
# Save:
ggsave(plot = iconic_bars_p, filename = '../figures/by_word_iconic_bars.pdf',
width = 10, height = 4)
Same for other_gesture_rate:
# Plot basics:
other_bars_p <- by_word_all |>
ggplot(aes(x = reorder(word, other_gesture_rate),
y = other_gesture_rate, fill = type)) +
geom_col(width = 0.75) +
geom_text(aes(label = no_of_eligible_tokens),
nudge_y = +0.025,
size = 2.7)
# Axes and labels:
other_bars_p <- other_bars_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.1)) +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '',
labels = c('low iconicity word', 'high iconicity word')) +
ylab('Proportion of tokens with non-iconic gestures') +
xlab(NULL)
# Look and feel:
other_bars_p <- other_bars_p +
theme_classic() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = 'bottom')
# Show:
other_bars_p
# Save:
ggsave(plot = other_bars_p, filename = '../figures/by_word_other_bars.pdf',
width = 10, height = 4)
The \n in the level labels seems to cause issues when
extracting estimates from the model. So we'll build the model with cleaned level
labels for that predictor and call the new column type_cleaned. Let's also make
sure that the reference level is the non-iconic words:
df <- mutate(df,
type_cleaned = if_else(type == "non-iconic \nword", "non_iconic", "iconic"),
type_cleaned = factor(type_cleaned, levels = c('non_iconic', 'iconic')))
Specify weakly informative priors. For the beta coefficient, we follow the recommendation of Gelman et al. (2008) and use a Cauchy prior centered at 0 with scale 2.5.
weak_priors <- c(prior(student_t(3, 0, 2.5), class = Intercept),
prior(student_t(3, 0, 2.5), class = sd),
prior(cauchy(0, 2.5), class = b)) # Gelman et al. (2008)
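To get a feel for how heavy-tailed the Cauchy(0, 2.5) prior is compared to a normal prior of the same scale, we can evaluate the densities and tail probabilities directly (a quick base-R sketch, not part of the original analysis):

```r
# Density at the center and far in the tail, Cauchy(0, 2.5) vs Normal(0, 2.5):
x <- c(0, 10)
dcauchy(x, location = 0, scale = 2.5)
dnorm(x, mean = 0, sd = 2.5)

# Prior probability of a large effect, |beta| > 5 (i.e., 5 log-odds units):
2 * pcauchy(5, scale = 2.5, lower.tail = FALSE) # roughly 0.30
2 * pnorm(5, sd = 2.5, lower.tail = FALSE)      # roughly 0.05
```

The Cauchy thus concentrates mass near zero like the normal does, but reserves much more prior probability for large effects, which is what makes it weakly rather than strongly informative.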
Let’s fit a model with a fixed effect of type, and
random intercepts for word and url because we
have multiple data points for each of these grouping factors. We cannot
fit type random slopes here because there is no possible
variation whatsoever of type within word or
url since each word or video is always either iconic or not
iconic.
gesture_mdl <- brm(gesture ~
# Fixed effects:
1 + type_cleaned +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(gesture_mdl, file = '../models/gesture_mdl.RData')
Load and show model:
# Load:
load('../models/gesture_mdl.Rdata')
# Show:
gesture_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: gesture ~ 1 + type_cleaned + (1 | word) + (1 | url)
## Data: df (Number of observations: 1546)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.24 0.26 1.77 2.80 1.00 1782 3753
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.31 0.24 0.91 1.86 1.00 3247 4147
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.57 0.33 -0.07 1.24 1.00 4451 5189
## type_cleanediconic 0.92 0.49 -0.03 1.93 1.00 5739 5109
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
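Since the model is fitted on the logit scale, the coefficients above can be translated into probabilities with plogis() and into odds ratios with exp(). A back-of-the-envelope sketch using the point estimates from the summary (this ignores the random effects and the posterior uncertainty, so it is illustrative only):

```r
# Predicted probability of gesture for a non-iconic word (intercept only):
plogis(0.57)        # about 0.64

# Predicted probability for an iconic word (intercept + coefficient):
plogis(0.57 + 0.92) # about 0.82

# The coefficient as an odds ratio:
exp(0.92)           # about 2.5: iconic words have ~2.5 times the odds of gesture
```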
Show the priors just to double-check:
prior_summary(gesture_mdl)
## prior class coef group resp dpar nlpar lb ub
## cauchy(0, 2.5) b
## cauchy(0, 2.5) b type_cleanediconic
## student_t(3, 0, 2.5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd url 0
## student_t(3, 0, 2.5) sd Intercept url 0
## student_t(3, 0, 2.5) sd word 0
## student_t(3, 0, 2.5) sd Intercept word 0
## source
## user
## (vectorized)
## user
## user
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
Show the posterior:
gesture_mdl_posts <- posterior_samples(gesture_mdl)
## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.
Make a plot of the posterior distribution of the type
coefficient:
# Plot basics:
gesture_posts_p <- gesture_mdl_posts |>
ggplot(aes(x = b_type_cleanediconic)) +
geom_density(fill = 'purple3') +
geom_vline(xintercept = 0, linetype = 'dashed')
# Axes and labels:
gesture_posts_p <- gesture_posts_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.2),
breaks = seq(0, 1.2, 0.2)) +
scale_x_continuous(limits = c(-3, 3),
breaks = seq(-3, 3, 1))
# Look and feel:
gesture_posts_p <- gesture_posts_p +
ylab('Density of posterior samples') +
xlab('Posterior estimate of coefficient') +
theme_classic()
# Show and save:
gesture_posts_p
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_density()`).
ggsave(plot = gesture_posts_p, filename = '../figures/gesture_posterior.pdf',
width = 4.7, height = 3)
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_density()`).
This shows that given the model formula, prior, and data, most of the
plausible type effects are positive, which means that it is
more plausible that iconic words also co-occur with gesture. Since the
posterior distribution crosses over zero, it is possible that
the effect could be negative, but this is quite improbable given where
the bulk of the posterior distribution lies.
Calculate the actual posterior probability of the effect being positive (= of the same sign). This essentially pins a number to the proportion of the area in the distribution above that lies to the right of the dashed line.
hypothesis(gesture_mdl, 'type_cleanediconic > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) > 0 0.92 0.49 0.12 1.74 32.06
## Post.Prob Star
## 1 0.97 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
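The Post.Prob and Evid.Ratio above can also be approximated by hand. Treating the posterior of the coefficient as roughly normal with the mean and error from the summary (an approximation only; hypothesis() works with the actual posterior draws):

```r
# Normal approximation to P(beta > 0), using the summary estimates:
p_positive <- pnorm(0, mean = 0.92, sd = 0.49, lower.tail = FALSE)
p_positive                    # about 0.97, matching Post.Prob

# Evidence ratio = P(beta > 0) / P(beta <= 0):
p_positive / (1 - p_positive) # about 32, matching Evid.Ratio
```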
Same for iconic gestures:
iconic_mdl <- brm(iconic_gesture ~
# Fixed effects:
1 + type_cleaned +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(iconic_mdl, file = '../models/iconic_mdl.RData')
Load and show model:
# Load model:
load('../models/iconic_mdl.RData')
# Show model:
iconic_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: iconic_gesture ~ 1 + type_cleaned + (1 | word) + (1 | url)
## Data: df (Number of observations: 1545)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.23 0.33 1.64 2.92 1.00 2026 3593
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.67 0.32 1.12 2.40 1.00 3090 4290
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -3.95 0.55 -5.10 -2.95 1.00 2927 4096
## type_cleanediconic 1.51 0.63 0.27 2.76 1.00 3921 4715
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Show the priors to double-check:
prior_summary(iconic_mdl)
## prior class coef group resp dpar nlpar lb ub
## cauchy(0, 2.5) b
## cauchy(0, 2.5) b type_cleanediconic
## student_t(3, 0, 2.5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd url 0
## student_t(3, 0, 2.5) sd Intercept url 0
## student_t(3, 0, 2.5) sd word 0
## student_t(3, 0, 2.5) sd Intercept word 0
## source
## user
## (vectorized)
## user
## user
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
Show the posterior of the type coefficient. First,
extract the posterior samples:
iconic_mdl_posts <- posterior_samples(iconic_mdl)
## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.
Make a plot of the posterior distribution of the coefficient:
# Plot basics:
iconic_posts_p <- iconic_mdl_posts |>
ggplot(aes(x = b_type_cleanediconic)) +
geom_density(fill = 'purple3') +
geom_vline(xintercept = 0, linetype = 'dashed')
# Axes and labels:
iconic_posts_p <- iconic_posts_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.2),
breaks = seq(0, 1.2, 0.2)) +
scale_x_continuous(limits = c(-3, 3),
breaks = seq(-3, 3, 1))
# Look and feel:
iconic_posts_p <- iconic_posts_p +
ylab('Density of posterior samples') +
xlab('Posterior estimate of coefficient') +
theme_classic()
# Show and save:
iconic_posts_p
## Warning: Removed 88 rows containing non-finite outside the scale range
## (`stat_density()`).
ggsave(plot = iconic_posts_p, filename = '../figures/iconic_gesture_posterior.pdf',
width = 4.7, height = 3)
## Warning: Removed 88 rows containing non-finite outside the scale range
## (`stat_density()`).
Check whether the main effect of iconicity is likely to be of the same sign. This is the posterior probability that iconic words have a higher iconic gesture rate:
hypothesis(iconic_mdl, 'type_cleanediconic > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) > 0 1.51 0.63 0.5 2.56 118.4
## Post.Prob Star
## 1 0.99 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
The final model is the one with only the non-iconic (= “other”) gestures:
other_mdl <- brm(non_iconic_gesture ~
# Fixed effects:
1 + type_cleaned +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = filter(df, !is.na(non_iconic_gesture)),
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(other_mdl, file = '../models/other_mdl.RData')
Load and show model:
# Load model:
load('../models/other_mdl.RData')
# Show model:
other_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: non_iconic_gesture ~ 1 + type_cleaned + (1 | word) + (1 | url)
## Data: filter(df, !is.na(non_iconic_gesture)) (Number of observations: 1335)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 736)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.32 0.29 1.80 2.93 1.00 1615 3180
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.21 0.24 0.79 1.74 1.00 2953 4807
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.26 0.31 -0.88 0.35 1.00 3801 4763
## type_cleanediconic -0.46 0.48 -1.41 0.47 1.00 5270 5613
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Show the priors to double-check:
prior_summary(other_mdl)
## prior class coef group resp dpar nlpar lb ub
## cauchy(0, 2.5) b
## cauchy(0, 2.5) b type_cleanediconic
## student_t(3, 0, 2.5) Intercept
## student_t(3, 0, 2.5) sd 0
## student_t(3, 0, 2.5) sd url 0
## student_t(3, 0, 2.5) sd Intercept url 0
## student_t(3, 0, 2.5) sd word 0
## student_t(3, 0, 2.5) sd Intercept word 0
## source
## user
## (vectorized)
## user
## user
## (vectorized)
## (vectorized)
## (vectorized)
## (vectorized)
Show the posterior of the type coefficient. First,
extract the posterior samples:
other_mdl_posts <- posterior_samples(other_mdl)
## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.
Make a plot of the posterior distribution of the coefficient:
# Plot basics:
other_posts_p <- other_mdl_posts |>
ggplot(aes(x = b_type_cleanediconic)) +
geom_density(fill = 'purple3') +
geom_vline(xintercept = 0, linetype = 'dashed')
# Axes and labels:
other_posts_p <- other_posts_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.2),
breaks = seq(0, 1.2, 0.2)) +
scale_x_continuous(limits = c(-3, 3),
breaks = seq(-3, 3, 1))
# Look and feel:
other_posts_p <- other_posts_p +
ylab('Density of posterior samples') +
xlab('Posterior estimate of coefficient') +
theme_classic()
# Show and save:
other_posts_p
ggsave(plot = other_posts_p, filename = '../figures/other_gesture_posterior.pdf',
width = 4.7, height = 3)
Check what proportion of the posterior samples is of the same (negative) sign:
hypothesis(other_mdl, 'type_cleanediconic < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) < 0 -0.46 0.48 -1.26 0.31 4.94
## Post.Prob Star
## 1 0.83
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
To assess whether the models make reasonable assumptions about the underlying data-generating processes, we can simulate new data from them and compare it to the observed data (posterior predictive checks).
Regular bar plots:
# Any gesture model:
pp_check(gesture_mdl, type = "bars", ndraws = 100)
# Iconic gesture model:
pp_check(iconic_mdl, type = "bars", ndraws = 100)
# Other gesture model:
pp_check(other_mdl, type = "bars", ndraws = 100)
Grouped bar plots:
# Any gesture model:
pp_check(gesture_mdl, type = "bars_grouped", ndraws = 100,
group = 'word')
ggsave('../figures/gesture_mdl_pp_check.pdf',
width = 14, height = 12)
# Iconic gesture model:
pp_check(iconic_mdl, type = "bars_grouped", ndraws = 100,
group = 'word')
ggsave('../figures/iconic_mdl_pp_check.pdf',
width = 14, height = 12)
# Other gesture model:
pp_check(other_mdl, type = "bars_grouped", ndraws = 100,
group = 'word')
ggsave('../figures/other_mdl_pp_check.pdf',
width = 14, height = 12)
ECDF (empirical cumulative distribution function):
# Any gesture model:
pp_check(gesture_mdl, type = "ecdf_overlay", ndraws = 100)
# Iconic gesture model:
pp_check(iconic_mdl, type = "ecdf_overlay", ndraws = 100)
# Other gesture model:
pp_check(other_mdl, type = "ecdf_overlay", ndraws = 100)
This completes this analysis.
Since frequency has been found to influence gesture production rates, let's incorporate it:
# load, rename, and log-transform
SUBTL <- read_csv('../data/SUBTLEX_US_with_POS.csv') |>
rename(freq = FREQcount) |>
rename(POS = Dom_PoS_SUBTLEX) |>
mutate(log_freq = log10(freq))
Merge this into by_word_all:
by_word_all <- left_join(by_word_all,
select(SUBTL, Word, POS, freq, log_freq),
by = c('word' = 'Word'))
Are iconic words different from non-iconic ones in terms of frequency?
by_word_all |>
mutate(type = as.character(type),
type = if_else(type == 'non-iconic word', 'low iconicity\nword', 'high iconicity\nword'),
type = factor(type, levels = c('low iconicity\nword', 'high iconicity\nword'))) |>
ggplot(aes(x = type, y = log_freq, fill = type)) +
geom_boxplot(width = 0.5) +
scale_fill_manual(values = c('steelblue', 'goldenrod3'),
name = '') +
xlab(NULL) +
ylab('Log10 frequency') +
scale_y_continuous(limits = c(0, 5)) +
theme_classic() +
theme(legend.position = 'none') +
theme(axis.text.x = element_text(size = 10, face = 'bold'),
axis.title.y = element_text(margin = margin(r = 15),
size = 12, face = 'bold'))
ggsave(filename = '../figures/barplot_frequency_difference.pdf',
width = 3.5, height = 4.2)
Looks like a big difference. Check the descriptive statistics for raw and log frequency:
by_word_all |>
group_by(type) |>
summarize(freq_M = mean(freq, na.rm = TRUE),
log_freq_M = mean(log_freq, na.rm = TRUE))
## # A tibble: 2 × 3
## type freq_M log_freq_M
## <fct> <dbl> <dbl>
## 1 non-iconic word 7408. 2.64
## 2 iconic word 160. 1.76
Check how big this difference is in terms of effect size (Cohen’s d):
cohen.d(log_freq ~ type, data = by_word_all)
##
## Cohen's d
##
## d estimate: 0.9058493 (large)
## 95 percent confidence interval:
## lower upper
## 0.2743632 1.5373355
Large effect size.
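For reference, Cohen's d is just the mean difference divided by the pooled standard deviation. A minimal hand-rolled version (illustrative only; effsize::cohen.d additionally handles small-sample details and reports a confidence interval):

```r
# Cohen's d from scratch: mean difference over the pooled SD
cohens_d <- function(x, y) {
  nx <- length(x); ny <- length(y)
  pooled_sd <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Toy check: group means one pooled SD apart should give d = 1
cohens_d(c(2, 3, 4), c(1, 2, 3)) # = 1
```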
Quickly check whether frequency is correlated with any of the gesture rates. We use Spearman's rho because these are proportions (though it doesn't matter much here, and Pearson's r would be fine as well):
with(by_word_all, cor.test(log_freq, gesture_rate, method = 'spearman'))
## Warning in cor.test.default(log_freq, gesture_rate, method = "spearman"):
## Cannot compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: log_freq and gesture_rate
## S = 16678, p-value = 0.5191
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.09865885
with(by_word_all, cor.test(log_freq, iconic_gesture_rate, method = 'spearman'))
## Warning in cor.test.default(log_freq, iconic_gesture_rate, method =
## "spearman"): Cannot compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: log_freq and iconic_gesture_rate
## S = 15267, p-value = 0.97
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.005764155
with(by_word_all, cor.test(log_freq, other_gesture_rate, method = 'spearman'))
## Warning in cor.test.default(log_freq, other_gesture_rate, method = "spearman"):
## Cannot compute exact p-value with ties
##
## Spearman's rank correlation rho
##
## data: log_freq and other_gesture_rate
## S = 17487, p-value = 0.3191
## alternative hypothesis: true rho is not equal to 0
## sample estimates:
## rho
## -0.1519454
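As an aside, Spearman's rho is simply Pearson's r computed on the ranks of the data, which is why the two methods give such similar answers here. A quick illustration with made-up numbers:

```r
# Spearman's rho equals Pearson's r on rank-transformed data (no ties):
x <- c(1.2, 3.5, 2.8, 7.1, 4.4)
y <- c(0.3, 2.2, 1.9, 5.0, 2.1)

cor(x, y, method = 'spearman')            # 0.9
cor(rank(x), rank(y), method = 'pearson') # identical: 0.9
```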
Perhaps Pearson’s r correlations are easier to interpret:
with(by_word_all, cor.test(log_freq, gesture_rate, method = 'pearson'))
##
## Pearson's product-moment correlation
##
## data: log_freq and gesture_rate
## t = -0.52008, df = 43, p-value = 0.6057
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3641455 0.2195670
## sample estimates:
## cor
## -0.07906252
with(by_word_all, cor.test(log_freq, iconic_gesture_rate, method = 'pearson'))
##
## Pearson's product-moment correlation
##
## data: log_freq and iconic_gesture_rate
## t = -0.075748, df = 43, p-value = 0.94
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3040537 0.2829425
## sample estimates:
## cor
## -0.01155072
with(by_word_all, cor.test(log_freq, other_gesture_rate, method = 'pearson'))
##
## Pearson's product-moment correlation
##
## data: log_freq and other_gesture_rate
## t = -0.5177, df = 43, p-value = 0.6073
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.3638327 0.2199102
## sample estimates:
## cor
## -0.07870418
Not much of any correlation at all. These results are perhaps worth reporting, since they are easy for many readers to understand, while making clear that they are based on by-word averages.
Make a plot of this, also incorporating word iconicity
(type). For heuristic purposes we’ll also add a linear
model to see whether we can spot any obvious relations that differ
between iconic and non-iconic words:
by_word_all |>
ggplot(aes(x = log_freq, y = gesture_rate,
color = type, label = word)) +
geom_text() +
xlab('Log frequency') +
ylab('Gesture proportion') +
# geom_smooth(method = 'lm') +
scale_color_manual(values = c('steelblue', 'goldenrod3'),
name = '') +
theme_classic() +
theme(legend.position = 'bottom',
axis.title = element_text(size = 12, face = 'bold'),
axis.title.y = element_text(margin = margin(r = 15)),
axis.title.x = element_text(margin = margin(t = 12)))
# Save:
ggsave('../figures/by_word_correlation_frequency_gesture_proportion.pdf',
width = 6.5, height = 5)
The massive overlap and wide scatter suggest that there's not much of a relation.
by_word_all |>
ggplot(aes(x = log_freq, y = iconic_gesture_rate,
color = type, label = word)) +
geom_text() +
xlab('Log frequency') +
ylab('Iconic gesture proportion') +
# geom_smooth(method = 'lm') +
scale_color_manual(values = c('steelblue', 'goldenrod3'),
name = '') +
theme_classic() +
theme(legend.position = 'bottom')
# Save:
ggsave('../figures/by_word_correlation_frequency_iconic_gesture_proportion.pdf',
width = 6.5, height = 5)
At most a very weak relationship between frequency and iconic gesturing.
What happens if we incorporate frequency into the main analysis? For
this we also need to add it to df:
df <- left_join(df, select(SUBTL, Word, freq, log_freq),
by = c('word' = 'Word'))
The model has problems if type contains the line break we included above for data visualization purposes, so clean the labels here as well.
df <- mutate(df,
type = as.character(type),
type = ifelse(type == 'non-iconic \nword',
'non_iconic',
'iconic'),
type = factor(type, levels = c('non_iconic', 'iconic')))
Fit the model of any gesture with log frequency added as a predictor:
gesture_freq_mdl <- brm(gesture ~
# Fixed effects:
1 + type_cleaned + log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(gesture_freq_mdl, file = '../models/gesture_freq_mdl.RData')
Load:
load('../models/gesture_freq_mdl.RData')
Show model:
gesture_freq_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: gesture ~ 1 + type_cleaned + log_freq + (1 | word) + (1 | url)
## Data: df (Number of observations: 1546)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.24 0.26 1.79 2.79 1.00 1945 3435
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.34 0.24 0.93 1.88 1.00 3061 5027
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.50 0.75 -0.95 1.97 1.00 3398 4431
## type_cleanediconic 0.94 0.56 -0.11 2.08 1.00 3803 5013
## log_freq 0.03 0.25 -0.46 0.52 1.00 3103 4584
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check effects:
hypothesis(gesture_freq_mdl, 'type_cleanediconic > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) > 0 0.94 0.56 0.05 1.87 22.6
## Post.Prob Star
## 1 0.96 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_freq_mdl, 'log_freq > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (log_freq) > 0 0.03 0.25 -0.38 0.43 1.22 0.55
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
Same for iconic gestures:
iconic_freq_mdl <- brm(iconic_gesture ~
# Fixed effects:
1 + type_cleaned + log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(iconic_freq_mdl, file = '../models/iconic_freq_mdl.RData')
Load:
load('../models/iconic_freq_mdl.RData')
Show model:
iconic_freq_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: iconic_gesture ~ 1 + type_cleaned + log_freq + (1 | word) + (1 | url)
## Data: df (Number of observations: 1545)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.23 0.34 1.62 2.96 1.00 1940 3523
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.70 0.34 1.15 2.49 1.00 2439 4082
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -4.87 1.13 -7.20 -2.82 1.00 2895 3972
## type_cleanediconic 1.89 0.76 0.46 3.47 1.00 3257 5117
## log_freq 0.32 0.34 -0.32 1.01 1.00 2987 4360
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check effects:
hypothesis(iconic_freq_mdl, 'type_cleanediconic > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) > 0 1.89 0.76 0.66 3.19 209.53
## Post.Prob Star
## 1 1 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_freq_mdl, 'log_freq > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (log_freq) > 0 0.32 0.34 -0.22 0.88 5.24 0.84
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
Next, look at the non-iconic (“other”) gestures:
other_freq_mdl <- brm(non_iconic_gesture ~
# Fixed effects:
1 + type_cleaned + log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = filter(df, !is.na(non_iconic_gesture)),
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(other_freq_mdl, file = '../models/other_freq_mdl.RData')
Load:
load('../models/other_freq_mdl.RData')
Show model:
other_freq_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: non_iconic_gesture ~ 1 + type_cleaned + log_freq + (1 | word) + (1 | url)
## Data: filter(df, !is.na(non_iconic_gesture)) (Number of observations: 1335)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 736)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.35 0.29 1.83 2.96 1.00 1644 3065
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.23 0.24 0.82 1.77 1.00 2564 4419
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.43 0.74 -1.87 0.99 1.00 4402 4934
## type_cleanediconic -0.40 0.54 -1.47 0.67 1.00 4611 5034
## log_freq 0.06 0.24 -0.42 0.53 1.00 3849 5051
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check effects:
hypothesis(other_freq_mdl, 'type_cleanediconic < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) < 0 -0.4 0.54 -1.31 0.48 3.35
## Post.Prob Star
## 1 0.77
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_freq_mdl, 'log_freq > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (log_freq) > 0 0.06 0.24 -0.33 0.45 1.52 0.6
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
Show the posterior:
gesture_mdl_posts <- posterior_samples(gesture_freq_mdl)
## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.
Make a plot of the posterior distribution of the type
coefficient:
# Plot basics:
gesture_posts_p <- gesture_mdl_posts |>
ggplot(aes(x = b_type_cleanediconic)) +
geom_density(fill = 'purple3') +
geom_vline(xintercept = 0, linetype = 'dashed')
# Axes and labels:
gesture_posts_p <- gesture_posts_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.2),
breaks = seq(0, 1.2, 0.2)) +
scale_x_continuous(limits = c(-3.5, 3.5),
breaks = seq(-3, 3, 1))
# Look and feel:
gesture_posts_p <- gesture_posts_p +
ylab('Density') +
xlab('Posterior estimate') +
theme_classic()
# Show and save:
gesture_posts_p
ggsave(plot = gesture_posts_p, filename = '../figures/gesture_posterior_with_freq.pdf',
width = 4.7, height = 3)
Show the posterior of the type coefficient. First,
extract the posterior samples:
iconic_mdl_posts <- posterior_samples(iconic_freq_mdl)
## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.
Make a plot of the posterior distribution of the coefficient:
# Plot basics:
iconic_posts_p <- iconic_mdl_posts |>
ggplot(aes(x = b_type_cleanediconic)) +
geom_density(fill = 'purple3') +
geom_vline(xintercept = 0, linetype = 'dashed')
# Axes and labels:
iconic_posts_p <- iconic_posts_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.2),
breaks = seq(0, 1.2, 0.2)) +
scale_x_continuous(limits = c(-3.5, 3.5),
breaks = seq(-3, 3, 1))
# Look and feel:
iconic_posts_p <- iconic_posts_p +
ylab('Density of posterior samples') +
xlab('Posterior estimate') +
theme_classic()
# Show and save:
iconic_posts_p
## Warning: Removed 186 rows containing non-finite outside the scale range
## (`stat_density()`).
ggsave(plot = iconic_posts_p, filename = '../figures/iconic_gesture_posterior_with_freq.pdf',
width = 4.7, height = 3)
## Warning: Removed 186 rows containing non-finite outside the scale range
## (`stat_density()`).
Show the posterior of the type coefficient. First,
extract the posterior samples:
other_mdl_posts <- posterior_samples(other_freq_mdl)
## Warning: Method 'posterior_samples' is deprecated. Please see ?as_draws for
## recommended alternatives.
Make a plot of the posterior distribution of the coefficient:
# Plot basics:
other_posts_p <- other_mdl_posts |>
ggplot(aes(x = b_type_cleanediconic)) +
geom_density(fill = 'purple3') +
geom_vline(xintercept = 0, linetype = 'dashed')
# Axes and labels:
other_posts_p <- other_posts_p +
scale_y_continuous(expand = c(0, 0),
limits = c(0, 1.2),
breaks = seq(0, 1.2, 0.2)) +
scale_x_continuous(limits = c(-3.5, 3.5),
breaks = seq(-3, 3, 1))
# Look and feel:
other_posts_p <- other_posts_p +
ylab('Density of posterior samples') +
xlab('Posterior estimate') +
theme_classic()
# Show and save:
other_posts_p
ggsave(plot = other_posts_p, filename = '../figures/other_gesture_posterior_with_freq.pdf',
width = 4.7, height = 3)
Put them all together into one plot:
gesture_posts_p <- gesture_posts_p +
ggtitle('a) Any gestures') +
theme(plot.title = element_text(face = 'bold', size = 12))
iconic_posts_p <- iconic_posts_p + ylab(NULL) +
ggtitle('b) Iconic gestures') +
theme(plot.title = element_text(face = 'bold', size = 12))
other_posts_p <- other_posts_p + ylab(NULL) +
ggtitle('c) Non-iconic gestures') +
theme(plot.title = element_text(face = 'bold', size = 12))
three_p <- gesture_posts_p + plot_spacer() + iconic_posts_p + plot_spacer() + other_posts_p +
plot_layout(widths = c(4, 0.1, 4, 0.1, 4))
ggsave(plot = three_p,
filename = '../figures/all_posteriors_with_frequency.pdf',
width = 9.5, height = 2.5)
## Warning: Removed 186 rows containing non-finite outside the scale range
## (`stat_density()`).
First, count how many unique adjectives and verbs there are:
distinct(df, word, POS) |>
count(POS)
## # A tibble: 2 × 2
## POS n
## <chr> <int>
## 1 adjective 15
## 2 verb 30
Check the average proportion of all gestures, separately for iconic vs. non-iconic words and POS:
#Save:
all_gesture_by_type_POS <- df |>
count(POS, type, gesture) |>
group_by(POS, type) |>
mutate(proportion = n / sum(n),
percentage = str_c(round(proportion, 3) * 100, '%'))
#Show:
all_gesture_by_type_POS
## # A tibble: 8 × 6
## # Groups: POS, type [4]
## POS type gesture n proportion percentage
## <chr> <fct> <dbl> <int> <dbl> <chr>
## 1 adjective non_iconic 0 91 0.464 46.4%
## 2 adjective non_iconic 1 105 0.536 53.6%
## 3 adjective iconic 0 77 0.407 40.7%
## 4 adjective iconic 1 112 0.593 59.3%
## 5 verb non_iconic 0 423 0.495 49.5%
## 6 verb non_iconic 1 432 0.505 50.5%
## 7 verb iconic 0 73 0.239 23.9%
## 8 verb iconic 1 233 0.761 76.1%
Next, iconic gestures (as a proportion of all coded tokens, gesture or no gesture):
#Save:
iconic_gesture_by_type_POS_over_gestures <- df |>
count(POS, type, iconic_gesture) |>
filter(!is.na(iconic_gesture)) |>
group_by(POS, type) |>
mutate(proportion = n / sum(n),
percentage = str_c(round(proportion, 3) * 100, '%'))
#Show:
iconic_gesture_by_type_POS_over_gestures
## # A tibble: 8 × 6
## # Groups: POS, type [4]
## POS type iconic_gesture n proportion percentage
## <chr> <fct> <dbl> <int> <dbl> <chr>
## 1 adjective non_iconic 0 171 0.872 87.2%
## 2 adjective non_iconic 1 25 0.128 12.8%
## 3 adjective iconic 0 173 0.915 91.5%
## 4 adjective iconic 1 16 0.0847 8.5%
## 5 verb non_iconic 0 778 0.911 91.1%
## 6 verb non_iconic 1 76 0.0890 8.9%
## 7 verb iconic 0 213 0.696 69.6%
## 8 verb iconic 1 93 0.304 30.4%
Next, all other gestures (as a proportion of all tokens that do not carry an iconic gesture):
#Save:
other_gesture_by_type_POS_over_gestures <- df |>
count(POS, type, non_iconic_gesture) |>
filter(!is.na(non_iconic_gesture)) |>
group_by(POS, type) |>
mutate(proportion = n / sum(n),
percentage = str_c(round(proportion, 3) * 100, '%'))
#Show:
other_gesture_by_type_POS_over_gestures
## # A tibble: 8 × 6
## # Groups: POS, type [4]
## POS type non_iconic_gesture n proportion percentage
## <chr> <fct> <chr> <int> <dbl> <chr>
## 1 adjective non_iconic gesture (non-iconic) 80 0.468 46.8%
## 2 adjective non_iconic no gesture 91 0.532 53.2%
## 3 adjective iconic gesture (non-iconic) 96 0.555 55.5%
## 4 adjective iconic no gesture 77 0.445 44.5%
## 5 verb non_iconic gesture (non-iconic) 355 0.456 45.6%
## 6 verb non_iconic no gesture 423 0.544 54.4%
## 7 verb iconic gesture (non-iconic) 140 0.657 65.7%
## 8 verb iconic no gesture 73 0.343 34.3%
To calculate the gesture rate over eligible tokens, we first need the number of eligible tokens per cell.
# Save:
eligible_tokens_by_type_POS <- df |>
count(POS, type) |>
group_by(POS, type) |>
summarise(total = sum(n))
## `summarise()` has grouped output by 'POS'. You can override using the `.groups`
## argument.
# Show:
eligible_tokens_by_type_POS
## # A tibble: 4 × 3
## # Groups: POS [2]
## POS type total
## <chr> <fct> <int>
## 1 adjective non_iconic 196
## 2 adjective iconic 189
## 3 verb non_iconic 855
## 4 verb iconic 306
Now, calculate the number of iconic gestures over eligible tokens:
# Save:
iconic_gesture_by_type_POS_over_eligible_tokens <- df |>
count(POS, type, iconic_gesture) |>
left_join(eligible_tokens_by_type_POS, by = c("POS", "type")) |>
filter(iconic_gesture == 1) |>
group_by(POS, type) |>
mutate(proportion = n / total,
percentage = str_c(round(proportion, 3) * 100, '%'))
# Show:
iconic_gesture_by_type_POS_over_eligible_tokens
## # A tibble: 4 × 7
## # Groups: POS, type [4]
## POS type iconic_gesture n total proportion percentage
## <chr> <fct> <dbl> <int> <int> <dbl> <chr>
## 1 adjective non_iconic 1 25 196 0.128 12.8%
## 2 adjective iconic 1 16 189 0.0847 8.5%
## 3 verb non_iconic 1 76 855 0.0889 8.9%
## 4 verb iconic 1 93 306 0.304 30.4%
Now, other gestures (over eligible tokens):
# Save:
other_gesture_by_type_POS_over_eligible_tokens <- df |>
count(POS, type, non_iconic_gesture) |>
left_join(eligible_tokens_by_type_POS, by = c("POS", "type")) |>
filter(non_iconic_gesture == 'gesture (non-iconic)') |>
group_by(POS, type) |>
mutate(proportion = n / total,
percentage = str_c(round(proportion, 3) * 100, '%'))
# Show:
other_gesture_by_type_POS_over_eligible_tokens
## # A tibble: 4 × 7
## # Groups: POS, type [4]
## POS type non_iconic_gesture n total proportion percentage
## <chr> <fct> <chr> <int> <int> <dbl> <chr>
## 1 adjective non_iconic gesture (non-iconic) 80 196 0.408 40.8%
## 2 adjective iconic gesture (non-iconic) 96 189 0.508 50.8%
## 3 verb non_iconic gesture (non-iconic) 355 855 0.415 41.5%
## 4 verb iconic gesture (non-iconic) 140 306 0.458 45.8%
Now let’s calculate the average gesture rates by word, and then average these by word type and POS (using the ‘by_word_all’ table).
First, the overall gesture rate:
# Group by type and POS and calculate the average gesture_rate for each group
average_gesture_rate_POS <- by_word_all %>%
group_by(type, POS) %>%
summarise(avg_gesture_rate = mean(gesture_rate, na.rm = TRUE))
## `summarise()` has grouped output by 'type'. You can override using the
## `.groups` argument.
# View the result
print(average_gesture_rate_POS)
## # A tibble: 4 × 3
## # Groups: type [2]
## type POS avg_gesture_rate
## <fct> <chr> <dbl>
## 1 non-iconic word Adjective 0.490
## 2 non-iconic word Verb 0.583
## 3 iconic word Adjective 0.551
## 4 iconic word Verb 0.768
Then, iconic gesture rate:
# Group by type and POS and calculate the average iconic_gesture_rate for each group
average_iconic_gesture_rate_POS <- by_word_all %>%
group_by(type, POS) %>%
summarise(avg_gesture_rate = mean(iconic_gesture_rate, na.rm = TRUE))
## `summarise()` has grouped output by 'type'. You can override using the
## `.groups` argument.
# View the result
print(average_iconic_gesture_rate_POS)
## # A tibble: 4 × 3
## # Groups: type [2]
## type POS avg_gesture_rate
## <fct> <chr> <dbl>
## 1 non-iconic word Adjective 0.125
## 2 non-iconic word Verb 0.102
## 3 iconic word Adjective 0.114
## 4 iconic word Verb 0.310
And finally, other gesture rate:
# Group by type and POS and calculate the average other_gesture_rate for each group
average_other_gesture_rate_POS <- by_word_all %>%
group_by(type, POS) %>%
summarise(avg_gesture_rate = mean(other_gesture_rate, na.rm = TRUE))
## `summarise()` has grouped output by 'type'. You can override using the
## `.groups` argument.
# View the result
print(average_other_gesture_rate_POS)
## # A tibble: 4 × 3
## # Groups: type [2]
## type POS avg_gesture_rate
## <fct> <chr> <dbl>
## 1 non-iconic word Adjective 0.365
## 2 non-iconic word Verb 0.479
## 3 iconic word Adjective 0.437
## 4 iconic word Verb 0.458
Let’s fit a model with a fixed effect of type, now with the
POS effect added, and with random intercepts for
word and url because we have multiple data
points for each of these grouping factors. We cannot fit
type random slopes here because there is no
variation of type within word or
url: each word or video is always either iconic or
non-iconic.
This time around, it makes sense to sum-code POS and type (contrast codes of -1 and +1) so that we can interpret main effects more easily in the presence of interactions.
# Make into factor:
df <- mutate(df,
type_c = factor(type, levels = c('iconic', 'non_iconic')),
POS_c = factor(POS, levels = c('verb', 'adjective')))
# Add sum-coding contrast coding scheme for 2 categories:
contrasts(df$type_c) <- contr.sum(2)
contrasts(df$POS_c) <- contr.sum(2)
# Check:
contrasts(df$type_c)
## [,1]
## iconic 1
## non_iconic -1
contrasts(df$POS_c)
## [,1]
## verb 1
## adjective -1
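To see what this coding does to the design matrix, one can inspect it directly (a quick check, not needed for the analysis itself):

```r
# With contr.sum(2), the predictor column is +1 for the first level and
# -1 for the second, so the intercept is the (logit-scale) grand mean and
# the difference between levels is twice the coefficient:
head(model.matrix(~ type_c, data = df))
```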
Now, fit the model:
gesture_POS_mdl <- brm(gesture ~
# Fixed effects:
1 + type_c + POS_c + type_c:POS_c +
log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(gesture_POS_mdl, file = '../models/gesture_POS_mdl.RData')
Load and show model:
# Load:
load('../models/gesture_POS_mdl.RData')
# Show:
gesture_POS_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: gesture ~ 1 + type_c + POS_c + type_c:POS_c + log_freq + (1 | word) + (1 | url)
## Data: df (Number of observations: 1546)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.27 0.26 1.80 2.80 1.00 1625 3623
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.32 0.25 0.89 1.87 1.00 2289 4169
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 0.90 0.64 -0.34 2.17 1.00 3203 4554
## type_c1 0.43 0.31 -0.16 1.06 1.00 3076 4683
## POS_c1 0.44 0.27 -0.09 0.98 1.00 3734 4796
## log_freq 0.00 0.25 -0.51 0.51 1.00 2684 4036
## type_c1:POS_c1 0.27 0.28 -0.26 0.82 1.00 3184 4327
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
This calculates the actual posterior probability of the effect being positive (= of the same sign), which amounts to quantifying what proportion of the area in the distribution above lies to the right of the dashed line.
hypothesis(gesture_POS_mdl, 'type_c1 > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (type_c1) > 0 0.43 0.31 -0.07 0.96 11.03 0.92
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_POS_mdl, 'POS_c1 > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (POS_c1) > 0 0.44 0.27 -0.01 0.88 18 0.95
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_POS_mdl, 'type_c1:POS_c1 > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_c1:POS_c1) > 0 0.27 0.28 -0.18 0.72 5.16
## Post.Prob Star
## 1 0.84
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
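The Post.Prob values reported by hypothesis() can be cross-checked by counting posterior draws directly (a sketch; b_type_c1 is the name brms assigns to the sum-coded type coefficient):

```r
# Extract draws and compute the proportion with a positive type effect:
draws <- as_draws_df(gesture_POS_mdl)
mean(draws$b_type_c1 > 0)
```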
Same for iconic gestures:
iconic_POS_mdl <- brm(iconic_gesture ~
# Fixed effects:
1 + type_c + POS_c + type_c:POS_c +
log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(iconic_POS_mdl, file = '../models/iconic_POS_mdl.RData')
Load and show model:
# Load model:
load('../models/iconic_POS_mdl.RData')
# Show model:
iconic_POS_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: iconic_gesture ~ 1 + type_c + POS_c + type_c:POS_c + log_freq + (1 | word) + (1 | url)
## Data: df (Number of observations: 1545)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.30 0.35 1.68 3.04 1.00 1757 3422
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.56 0.33 1.01 2.31 1.00 2678 4026
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -4.04 0.90 -5.97 -2.43 1.00 3308 4248
## type_c1 0.80 0.41 0.03 1.64 1.00 3526 4015
## POS_c1 0.63 0.34 -0.03 1.31 1.00 4211 4390
## log_freq 0.26 0.32 -0.36 0.91 1.00 3333 4309
## type_c1:POS_c1 0.52 0.34 -0.15 1.20 1.00 3736 4642
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check what proportion of the posterior samples is of the same sign:
hypothesis(iconic_POS_mdl, 'type_c1 > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (type_c1) > 0 0.8 0.41 0.15 1.5 46.06 0.98 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_POS_mdl, 'POS_c1 > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (POS_c1) > 0 0.63 0.34 0.07 1.19 30.62 0.97 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_POS_mdl, 'type_c1:POS_c1 > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_c1:POS_c1) > 0 0.52 0.34 -0.04 1.09 15.16
## Post.Prob Star
## 1 0.94
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
The final model is for the non-iconic (= “other”) gestures only:
other_POS_mdl <- brm(non_iconic_gesture ~
# Fixed effects:
1 + type_c + POS_c + type_c:POS_c +
log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = filter(df, !is.na(non_iconic_gesture)),
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(other_POS_mdl, file = '../models/other_POS_mdl.RData')
Load and show model:
# Load model:
load('../models/other_POS_mdl.RData')
# Show model:
other_POS_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: non_iconic_gesture ~ 1 + type_c + POS_c + type_c:POS_c + log_freq + (1 | word) + (1 | url)
## Data: filter(df, !is.na(non_iconic_gesture)) (Number of observations: 1335)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 736)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.37 0.29 1.84 3.00 1.00 1640 3239
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.29 0.26 0.84 1.86 1.00 2517 4594
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -0.56 0.63 -1.78 0.71 1.00 3730 5032
## type_c1 -0.21 0.31 -0.85 0.40 1.00 3515 4348
## POS_c1 -0.27 0.28 -0.83 0.28 1.00 4470 5249
## log_freq 0.06 0.25 -0.46 0.54 1.00 3152 4561
## type_c1:POS_c1 -0.13 0.28 -0.68 0.41 1.00 4180 4767
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check what proportion of the posterior samples is of the same sign:
hypothesis(other_POS_mdl, 'type_c1 < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (type_c1) < 0 -0.21 0.31 -0.73 0.31 2.92 0.75
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_POS_mdl, 'POS_c1 < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob Star
## 1 (POS_c1) < 0 -0.27 0.28 -0.74 0.18 5.39 0.84
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_POS_mdl, 'type_c1:POS_c1 < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_c1:POS_c1) < 0 -0.13 0.28 -0.58 0.32 2.1
## Post.Prob Star
## 1 0.68
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
Load Lancaster data:
lanc <- read_csv('../data/Lancaster_sensorimotor_norms_for_39707_words.csv')
# Show:
lanc
## # A tibble: 39,707 × 45
## Word Auditory.mean Gustatory.mean Haptic.mean Interoceptive.mean
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 A 2.21 0 0.429 0
## 2 A CAPPELLA 4.33 0 0.222 0.722
## 3 AARDVARK 1.62 0.562 1.62 0.0625
## 4 ABACK 1.29 0.0588 0.294 1.35
## 5 ABACUS 1.56 0.167 3.72 0.278
## 6 ABANDON 0.941 0.118 0.294 2.12
## 7 ABANDONED 1.44 0 0.0556 1.67
## 8 ABANDONEE 1.94 0 1.38 0.562
## 9 ABANDONER 0.571 0 0.143 1.64
## 10 ABANDONMENT 2.13 0.0667 1.2 2.93
## # ℹ 39,697 more rows
## # ℹ 40 more variables: Olfactory.mean <dbl>, Visual.mean <dbl>,
## # Foot_leg.mean <dbl>, Hand_arm.mean <dbl>, Head.mean <dbl>,
## # Mouth.mean <dbl>, Torso.mean <dbl>, Auditory.SD <dbl>, Gustatory.SD <dbl>,
## # Haptic.SD <dbl>, Interoceptive.SD <dbl>, Olfactory.SD <dbl>,
## # Visual.SD <dbl>, Foot_leg.SD <dbl>, Hand_arm.SD <dbl>, Head.SD <dbl>,
## # Mouth.SD <dbl>, Torso.SD <dbl>, Max_strength.perceptual <dbl>, …
Among the iconic words, the dominant modalities are only auditory, gustatory, haptic, and visual, so we’ll extract the perceptual strength measures for these modalities, plus olfaction to complete the five senses.
Make Word lowercase, both in the column name and in the
actual content of the column:
lanc <- lanc |>
rename(word = Word,
dominant_modality = Dominant.perceptual,
dominant_action = Dominant.action,
auditory_strength = Auditory.mean,
visual_strength = Visual.mean,
haptic_strength = Haptic.mean,
gustatory_strength = Gustatory.mean,
olfactory_strength = Olfactory.mean,
hand_arm_mean = Hand_arm.mean,
max_strength = Max_strength.perceptual,
head_mean = Head.mean,
foot_leg_mean = Foot_leg.mean,
mouth_mean = Mouth.mean,
torso_mean = Torso.mean,
max_action = Max_strength.action) |>
mutate(word = str_to_lower(word))
Merge with main data:
df <- left_join(df,
select(lanc, word, dominant_modality, dominant_action,
auditory_strength, visual_strength, haptic_strength,
gustatory_strength, olfactory_strength,
hand_arm_mean, foot_leg_mean, mouth_mean, torso_mean, head_mean,
max_strength, max_action))
## Joining with `by = join_by(word)`
Merge with proportion table:
by_word_all <- left_join(by_word_all,
select(lanc, word, dominant_modality, dominant_action,
auditory_strength, visual_strength, haptic_strength,
gustatory_strength, olfactory_strength,
max_strength, max_action))
## Joining with `by = join_by(word)`
# Save outside of R:
by_word_all |>
write_csv('../data/by_word_proportions_with_modality.csv')
Which words are classified how?
select(by_word_all,
word, type, dominant_modality, dominant_action) |>
print(n = Inf)
## # A tibble: 45 × 4
## word type dominant_modality dominant_action
## <chr> <fct> <chr> <chr>
## 1 said non-iconic word Auditory Mouth
## 2 spank iconic word Haptic Hand_arm
## 3 slushy iconic word Visual Mouth
## 4 realize non-iconic word Interoceptive Head
## 5 inform non-iconic word Auditory Mouth
## 6 wearing non-iconic word Visual Foot_leg
## 7 yucky iconic word Gustatory Mouth
## 8 knew non-iconic word Interoceptive Head
## 9 filling non-iconic word Visual Mouth
## 10 other non-iconic word Visual Head
## 11 exact non-iconic word Visual Head
## 12 grateful non-iconic word Interoceptive Head
## 13 prevail non-iconic word Interoceptive Head
## 14 squish iconic word Haptic Hand_arm
## 15 tamper non-iconic word Visual Hand_arm
## 16 splotch iconic word Visual Head
## 17 put non-iconic word Visual Hand_arm
## 18 confirmed non-iconic word Visual Mouth
## 19 rejoin non-iconic word Visual Head
## 20 discern non-iconic word Visual Head
## 21 jealous non-iconic word Interoceptive Head
## 22 puffy iconic word Visual Head
## 23 covet non-iconic word Interoceptive Head
## 24 ordain non-iconic word Visual Mouth
## 25 swish iconic word Visual Head
## 26 wring iconic word Haptic Hand_arm
## 27 saggy iconic word Visual Head
## 28 swoosh iconic word Auditory Hand_arm
## 29 zap iconic word Haptic Hand_arm
## 30 outwit non-iconic word Visual Head
## 31 chomp iconic word Visual Mouth
## 32 crispy iconic word Haptic Mouth
## 33 absent non-iconic word Visual Head
## 34 wheeze iconic word Auditory Mouth
## 35 woof iconic word Auditory Head
## 36 sullen non-iconic word Visual Head
## 37 barking iconic word Auditory Head
## 38 acquaint non-iconic word Visual Head
## 39 bang iconic word Auditory Head
## 40 munch iconic word Gustatory Mouth
## 41 plump iconic word Visual Head
## 42 wobbly iconic word Visual Foot_leg
## 43 barren non-iconic word Visual Head
## 44 snap iconic word Auditory Hand_arm
## 45 gooey iconic word Haptic Hand_arm
Descriptive averages by dominant modality, also with counts:
# Create count table:
modality_type_counts <- by_word_all |>
count(type, dominant_modality)
# Create descriptive averages and merge with counts:
by_word_all |>
group_by(type, dominant_modality) |>
summarize(M_gesture = mean(gesture_rate),
M_iconic = mean(iconic_gesture_rate),
M_other = mean(other_gesture_rate)) |>
left_join(modality_type_counts)
## `summarise()` has grouped output by 'type'. You can override using the
## `.groups` argument.
## Joining with `by = join_by(type, dominant_modality)`
## # A tibble: 7 × 6
## # Groups: type [2]
## type dominant_modality M_gesture M_iconic M_other n
## <fct> <chr> <dbl> <dbl> <dbl> <int>
## 1 non-iconic word Auditory 0.262 0.0774 0.185 2
## 2 non-iconic word Interoceptive 0.488 0.0634 0.422 6
## 3 non-iconic word Visual 0.620 0.132 0.488 15
## 4 iconic word Auditory 0.717 0.238 0.479 6
## 5 iconic word Gustatory 0.532 0.118 0.414 2
## 6 iconic word Haptic 0.662 0.311 0.352 6
## 7 iconic word Visual 0.727 0.215 0.511 8
This clearly shows that visual words have the highest gesture rate, regardless of whether they are iconic or not. For iconic words, haptic, visual, and auditory words have higher gesture rates than gustatory ones. However, there are only two gustatory words, so we have to be careful not to over-interpret this: it very much hinges on the specific iconic words chosen. This data sparsity issue only concerns the categorical modality labels, however, since the continuous ratings exist for all words.
Descriptive averages by dominant action:
# Create count table:
action_type_counts <- by_word_all |>
count(type, dominant_action)
# Create descriptive averages and merge with counts:
by_word_all |>
group_by(type, dominant_action) |>
summarize(M_gesture = mean(gesture_rate),
M_iconic = mean(iconic_gesture_rate),
M_other = mean(other_gesture_rate)) |>
left_join(action_type_counts)
## `summarise()` has grouped output by 'type'. You can override using the
## `.groups` argument.
## Joining with `by = join_by(type, dominant_action)`
## # A tibble: 8 × 6
## # Groups: type [2]
## type dominant_action M_gesture M_iconic M_other n
## <fct> <chr> <dbl> <dbl> <dbl> <int>
## 1 non-iconic word Foot_leg 0.714 0.0536 0.661 1
## 2 non-iconic word Hand_arm 0.705 0.325 0.380 2
## 3 non-iconic word Head 0.539 0.0973 0.441 15
## 4 non-iconic word Mouth 0.508 0.0702 0.438 5
## 5 iconic word Foot_leg 0.8 0.6 0.2 1
## 6 iconic word Hand_arm 0.704 0.275 0.429 7
## 7 iconic word Head 0.701 0.233 0.468 8
## 8 iconic word Mouth 0.637 0.143 0.494 6
There are only two foot/leg words (one per type), so we shouldn’t
really interpret that row. For iconic words, the action ratings don’t
seem to matter as much as the modality, with Mouth a bit lower.
Plot proportions as a function of auditory_strength and
the other perceptual measures. We’ll use a facet wrap for this, so it
makes sense to have all the strength measures in long format, with an
indicator variable saying which modality each strength value belongs to:
by_word_long <- by_word_all |>
select(type, word, gesture_rate, iconic_gesture_rate, other_gesture_rate,
auditory_strength:olfactory_strength) |>
pivot_longer(cols = auditory_strength:olfactory_strength,
names_to = 'modality',
values_to = 'strength')
# Show a few random rows:
by_word_long |>
sample_n(10)
## # A tibble: 10 × 7
## type word gesture_rate iconic_gesture_rate other_gesture_rate modality
## <fct> <chr> <dbl> <dbl> <dbl> <chr>
## 1 iconic wo… woof 0.615 0.385 0.231 olfacto…
## 2 iconic wo… chomp 0.929 0.357 0.571 olfacto…
## 3 iconic wo… spank 0.691 0.124 0.567 haptic_…
## 4 iconic wo… plump 0.8 0 0.8 gustato…
## 5 iconic wo… cris… 0.714 0.0714 0.643 olfacto…
## 6 non-iconi… fill… 0.784 0.137 0.647 gustato…
## 7 iconic wo… bark… 0.636 0 0.636 auditor…
## 8 non-iconi… fill… 0.784 0.137 0.647 olfacto…
## 9 non-iconi… barr… 0.667 0.333 0.333 olfacto…
## 10 non-iconi… exact 0.826 0.217 0.609 gustato…
## # ℹ 1 more variable: strength <dbl>
Make a plot out of this, for overall gesture proportion:
# Plot basics:
modality_p <- by_word_long |>
ggplot(aes(x = strength, y = gesture_rate, col = type)) +
geom_smooth(method = 'lm') +
geom_point()
# Axis and labels:
modality_p <- modality_p +
ylab('Gesture proportion') +
xlab('Perceptual strength rating') +
scale_color_manual(values = c('steelblue', 'goldenrod3')) +
facet_wrap(~modality)
# Look and feel:
modality_p <- modality_p +
theme_classic() +
theme(legend.position = 'bottom')
# Show and save:
modality_p
## `geom_smooth()` using formula = 'y ~ x'
ggsave(plot = modality_p,
filename = '../figures/modality_scatterplot_matrix.pdf',
width = 12, height = 6)
## `geom_smooth()` using formula = 'y ~ x'
Same for iconic gesture proportion:
# Plot basics:
modality_iconicity_p <- by_word_long |>
ggplot(aes(x = strength, y = iconic_gesture_rate,
col = type)) +
geom_smooth(method = 'lm') +
geom_point() +
facet_wrap(~modality)
# Axis and labels:
modality_iconicity_p <- modality_iconicity_p +
ylab('Gesture proportion') +
xlab('Perceptual strength rating') +
scale_color_manual(values = c('steelblue', 'goldenrod3'))
# Look and feel:
modality_iconicity_p <- modality_iconicity_p +
theme_classic() +
theme(legend.position = 'bottom')
# Show and save:
modality_iconicity_p
## `geom_smooth()` using formula = 'y ~ x'
ggsave(plot = modality_iconicity_p,
filename = '../figures/modality_scatterplot_matrix_iconic_gestures.pdf',
width = 12, height = 6)
## `geom_smooth()` using formula = 'y ~ x'
Do the same, but as a single scatterplot with just the max strength, and then the max action.
First, overall gesture proportion:
# Plot basics:
modality_p <- by_word_all |>
ggplot(aes(x = max_strength, y = gesture_rate,
col = type, label = word)) +
geom_smooth(method = 'lm') +
geom_text()
# Axis and labels:
modality_p <- modality_p +
ylab('Gesture proportion') +
xlab('Max perceptual strength rating') +
scale_color_manual(values = c('steelblue', 'goldenrod3'))
# Look and feel:
modality_p <- modality_p +
theme_classic() +
theme(legend.position = 'bottom')
# Show and save:
modality_p
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
ggsave(plot = modality_p,
filename = '../figures/max_strength_scatterplot.pdf',
width = 6.5, height = 4.5)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
Not necessarily what one would expect, but pretty much anything is compatible with this data since the confidence regions are so wide.
Same for iconic gesture proportion:
# Plot basics:
modality_p <- by_word_all |>
ggplot(aes(x = max_strength, y = iconic_gesture_rate,
col = type, label = word)) +
geom_smooth(method = 'lm') +
geom_text()
# Axis and labels:
modality_p <- modality_p +
ylab('Iconic gesture proportion') +
xlab('Max perceptual strength rating') +
scale_color_manual(values = c('steelblue', 'goldenrod3'))
# Look and feel:
modality_p <- modality_p +
theme_classic() +
theme(legend.position = 'bottom')
# Show and save:
modality_p
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
ggsave(plot = modality_p,
filename = '../figures/max_strength_scatterplot_iconic_gestures.pdf',
width = 6.5, height = 4.5)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
This suggests that the driver is the specific iconic affordances of each verb's semantics, rather than how much sensory content a word carries overall.
Now the same with action strength. Make a plot of overall gesture proportion:
# Plot basics:
action_p <- by_word_all |>
ggplot(aes(x = max_action, y = gesture_rate,
col = type, label = word)) +
geom_smooth(method = 'lm') +
geom_text()
# Axis and labels:
action_p <- action_p +
ylab('Gesture proportion') +
xlab('Max action strength rating') +
scale_color_manual(values = c('steelblue', 'goldenrod3'))
# Look and feel:
action_p <- action_p +
theme_classic() +
theme(legend.position = 'bottom')
# Show and save:
action_p
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
ggsave(plot = action_p,
filename = '../figures/max_action_scatterplot.pdf',
width = 6.5, height = 4.5)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
Nothing much going on.
Same for iconic gesture proportion:
# Plot basics:
action_p <- by_word_all |>
ggplot(aes(x = max_action, y = iconic_gesture_rate,
col = type, label = word)) +
geom_smooth(method = 'lm') +
geom_text()
# Axis and labels:
action_p <- action_p +
ylab('Iconic gesture proportion') +
xlab('Max action strength rating') +
scale_color_manual(values = c('steelblue', 'goldenrod3'))
# Look and feel:
action_p <- action_p +
theme_classic() +
theme(legend.position = 'bottom')
# Show and save:
action_p
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
ggsave(plot = action_p,
filename = '../figures/max_action_scatterplot_iconic_gestures.pdf',
width = 6.5, height = 4.5)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: The following aesthetics were dropped during statistical transformation: label.
## ℹ This can happen when ggplot fails to infer the correct grouping structure in
## the data.
## ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
## variable into a factor?
Model with iconicity (word type), max perceptual strength, max action strength, and log frequency:
gesture_mod_mdl <- brm(gesture ~
# Fixed effects:
1 + type_cleaned + max_strength + max_action +
log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(gesture_mod_mdl, file = '../models/gesture_mod_mdl.RData')
Load and show model:
# Load:
load('../models/gesture_mod_mdl.RData')
# Show:
gesture_mod_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: gesture ~ 1 + type_cleaned + max_strength + max_action + log_freq + (1 | word) + (1 | url)
## Data: df (Number of observations: 1546)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.27 0.26 1.81 2.82 1.00 1661 3410
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.36 0.26 0.93 1.94 1.00 2149 3499
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.91 1.47 -1.16 4.72 1.00 2624 4024
## type_cleanediconic 1.32 0.67 0.03 2.68 1.00 2426 3770
## max_strength -0.52 0.44 -1.38 0.35 1.00 2402 3748
## max_action -0.02 0.30 -0.61 0.57 1.00 2457 3607
## log_freq 0.14 0.29 -0.42 0.71 1.00 2284 3221
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
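The coefficients are on the log-odds scale of the logit link. As an interpretive aid (a sketch, not part of the original analysis), they can be exponentiated into odds ratios:

```r
# Odds ratios with 95% credible bounds; the Est.Error column is excluded
# because exponentiating a standard error is not meaningful:
exp(fixef(gesture_mod_mdl)[, c('Estimate', 'Q2.5', 'Q97.5')])
# E.g. exp(1.32) is roughly 3.7: clips with iconic words have about 3.7 times
# the odds of containing a gesture, holding the other predictors constant.
```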
This calculates the actual posterior probability that the effect has the hypothesized sign, which is essentially just pinning a number on the proportion of the posterior distribution that falls on that side of zero.
hypothesis(gesture_mod_mdl, 'type_cleanediconic > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) > 0 1.32 0.67 0.23 2.42 44.45
## Post.Prob Star
## 1 0.98 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_mod_mdl, 'max_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (max_strength) < 0 -0.52 0.44 -1.23 0.2 7.63 0.88
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_mod_mdl, 'max_action < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (max_action) < 0 -0.02 0.3 -0.51 0.47 1.13 0.53
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
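These one-sided tests can also be reproduced by hand from the posterior draws. A sketch, assuming the fitted `gesture_mod_mdl` from above (brms prefixes fixed-effect draws with `b_`):

```r
draws <- as_draws_df(gesture_mod_mdl)
# Post.Prob is just the proportion of draws with the hypothesized sign:
mean(draws$b_type_cleanediconic > 0)
mean(draws$b_max_strength < 0)
# For one-sided hypotheses, Evid.Ratio is the corresponding posterior odds,
# i.e. Post.Prob / (1 - Post.Prob).
```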
Same for iconic gestures:
iconic_mod_mdl <- brm(iconic_gesture ~
# Fixed effects:
1 + type_cleaned + max_strength + max_action +
log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(iconic_mod_mdl, file = '../models/iconic_mod_mdl.RData')
Load and show model:
# Load model:
load('../models/iconic_mod_mdl.RData')
# Show model:
iconic_mod_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: iconic_gesture ~ 1 + type_cleaned + max_strength + max_action + log_freq + (1 | word) + (1 | url)
## Data: df (Number of observations: 1545)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.29 0.35 1.67 3.03 1.00 1743 3505
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.80 0.35 1.21 2.59 1.00 2616 4976
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -4.79 2.07 -9.19 -0.86 1.00 3634 4733
## type_cleanediconic 1.97 0.88 0.27 3.75 1.00 3169 4473
## max_strength -0.05 0.57 -1.18 1.08 1.00 3144 4369
## max_action -0.03 0.37 -0.76 0.70 1.00 3310 4280
## log_freq 0.35 0.38 -0.38 1.10 1.00 2984 4390
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check what proportion of the posterior draws are of the hypothesized sign for each effect:
hypothesis(iconic_mod_mdl, 'type_cleanediconic > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) > 0 1.97 0.88 0.54 3.45 85.96
## Post.Prob Star
## 1 0.99 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_mod_mdl, 'max_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (max_strength) < 0 -0.05 0.57 -0.97 0.91 1.16 0.54
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_mod_mdl, 'max_action < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (max_action) < 0 -0.03 0.37 -0.63 0.58 1.16 0.54
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
The final model of this set is for the non-iconic (= “other”) gestures only:
other_mod_mdl <- brm(non_iconic_gesture ~
# Fixed effects:
1 + type_cleaned + max_strength + max_action +
log_freq +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = filter(df, !is.na(non_iconic_gesture)),
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(other_mod_mdl, file = '../models/other_mod_mdl.RData')
Load and show model:
# Load model:
load('../models/other_mod_mdl.RData')
# Show model:
other_mod_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: non_iconic_gesture ~ 1 + type_cleaned + max_strength + max_action + log_freq + (1 | word) + (1 | url)
## Data: filter(df, !is.na(non_iconic_gesture)) (Number of observations: 1335)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 736)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.37 0.29 1.85 3.00 1.00 1693 3184
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.27 0.26 0.83 1.84 1.00 2219 4253
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -1.80 1.41 -4.56 1.06 1.00 3006 4337
## type_cleanediconic -0.79 0.64 -2.11 0.44 1.00 2641 3749
## max_strength 0.51 0.43 -0.32 1.35 1.00 2503 3795
## max_action 0.03 0.28 -0.54 0.58 1.00 2892 3504
## log_freq -0.06 0.27 -0.63 0.47 1.00 2207 3781
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check what proportion of the posterior draws are of the hypothesized sign for each effect:
hypothesis(other_mod_mdl, 'type_cleanediconic < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (type_cleanediconic) < 0 -0.79 0.64 -1.86 0.25 8.9
## Post.Prob Star
## 1 0.9
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_mod_mdl, 'max_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (max_strength) > 0 0.51 0.43 -0.18 1.21 8.07 0.89
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_mod_mdl, 'max_action > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (max_action) > 0 0.03 0.28 -0.44 0.49 1.19 0.54
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
Model with all five perceptual strength ratings as separate predictors:
gesture_all_senses_mdl <- brm(gesture ~
# Fixed effects:
1 + auditory_strength +
visual_strength +
haptic_strength +
gustatory_strength +
olfactory_strength +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(gesture_all_senses_mdl, file = '../models/gesture_all_senses_mdl.RData')
Load and show model:
# Load:
load('../models/gesture_all_senses_mdl.RData')
# Show:
gesture_all_senses_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: gesture ~ 1 + auditory_strength + visual_strength + haptic_strength + gustatory_strength + olfactory_strength + (1 | word) + (1 | url)
## Data: df (Number of observations: 1546)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.31 0.27 1.83 2.89 1.00 1318 2752
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.25 0.26 0.82 1.82 1.00 2101 4090
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -1.25 1.22 -3.60 1.13 1.00 2383 3960
## auditory_strength 0.19 0.24 -0.27 0.67 1.00 2530 3881
## visual_strength 0.56 0.36 -0.15 1.26 1.00 2455 4097
## haptic_strength 0.34 0.26 -0.18 0.87 1.00 2765 3799
## gustatory_strength -0.08 0.43 -0.95 0.77 1.00 2564 3537
## olfactory_strength -0.22 0.68 -1.55 1.13 1.00 3341 4139
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
This calculates the actual posterior probability that each effect has the hypothesized sign, which is essentially just pinning a number on the proportion of the posterior distribution that falls on that side of zero.
hypothesis(gesture_all_senses_mdl, 'auditory_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (auditory_strength) > 0 0.19 0.24 -0.19 0.59 3.59
## Post.Prob Star
## 1 0.78
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_senses_mdl, 'visual_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (visual_strength) > 0 0.56 0.36 -0.03 1.14 16.02
## Post.Prob Star
## 1 0.94
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_senses_mdl, 'haptic_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (haptic_strength) > 0 0.34 0.26 -0.08 0.77 9.77
## Post.Prob Star
## 1 0.91
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_senses_mdl, 'gustatory_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (gustatory_strength) < 0 -0.08 0.43 -0.8 0.62 1.35
## Post.Prob Star
## 1 0.57
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_senses_mdl, 'olfactory_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (olfactory_strength) < 0 -0.22 0.68 -1.33 0.9 1.75
## Post.Prob Star
## 1 0.64
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
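To see what these per-modality effects look like on the probability scale, one option (not computed in the original analysis) is brms's `conditional_effects()`, e.g. for the visual strength predictor:

```r
# Marginal effect of visual strength on the probability of gesture,
# with the other predictors held at their means / reference levels:
plot(conditional_effects(gesture_all_senses_mdl, effects = 'visual_strength'))
```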
Same for iconic gestures:
iconic_all_senses_mdl <- brm(iconic_gesture ~
# Fixed effects:
1 + auditory_strength +
visual_strength +
haptic_strength +
gustatory_strength +
olfactory_strength +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(iconic_all_senses_mdl, file = '../models/iconic_all_senses_mdl.RData')
Load and show model:
# Load model:
load('../models/iconic_all_senses_mdl.RData')
# Show model:
iconic_all_senses_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: iconic_gesture ~ 1 + auditory_strength + visual_strength + haptic_strength + gustatory_strength + olfactory_strength + (1 | word) + (1 | url)
## Data: df (Number of observations: 1545)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.35 0.35 1.71 3.10 1.00 1646 3235
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.72 0.36 1.11 2.54 1.00 2410 4322
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -5.72 1.75 -9.18 -2.33 1.00 2959 4408
## auditory_strength 0.55 0.32 -0.07 1.18 1.00 3319 4191
## visual_strength 0.08 0.49 -0.92 1.02 1.00 3088 4434
## haptic_strength 0.86 0.35 0.19 1.56 1.00 2817 4184
## gustatory_strength -0.48 0.56 -1.64 0.58 1.00 3755 4239
## olfactory_strength 0.06 0.90 -1.70 1.87 1.00 4540 5507
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check what proportion of the posterior draws are of the hypothesized sign for each effect:
hypothesis(iconic_all_senses_mdl, 'auditory_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (auditory_strength) > 0 0.55 0.32 0.02 1.08 21.86
## Post.Prob Star
## 1 0.96 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_senses_mdl, 'visual_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (visual_strength) > 0 0.08 0.49 -0.74 0.86 1.33
## Post.Prob Star
## 1 0.57
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_senses_mdl, 'haptic_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (haptic_strength) > 0 0.86 0.35 0.29 1.45 132.33
## Post.Prob Star
## 1 0.99 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_senses_mdl, 'gustatory_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (gustatory_strength) < 0 -0.48 0.56 -1.43 0.42 4.33
## Post.Prob Star
## 1 0.81
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_senses_mdl, 'olfactory_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (olfactory_strength) > 0 0.06 0.9 -1.41 1.55 1.11
## Post.Prob Star
## 1 0.53
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
The final model of this set is for the non-iconic (= “other”) gestures only:
other_all_senses_mdl <- brm(non_iconic_gesture ~
# Fixed effects:
1 + auditory_strength +
visual_strength +
haptic_strength +
gustatory_strength +
olfactory_strength +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = filter(df, !is.na(non_iconic_gesture)),
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(other_all_senses_mdl, file = '../models/other_all_senses_mdl.RData')
Load and show model:
# Load model:
load('../models/other_all_senses_mdl.RData')
# Show model:
other_all_senses_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: non_iconic_gesture ~ 1 + auditory_strength + visual_strength + haptic_strength + gustatory_strength + olfactory_strength + (1 | word) + (1 | url)
## Data: filter(df, !is.na(non_iconic_gesture)) (Number of observations: 1335)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 736)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.40 0.29 1.87 3.02 1.00 1808 3154
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.09 0.26 0.63 1.65 1.00 1750 2606
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.18 1.14 -1.09 3.48 1.00 2829 4175
## auditory_strength 0.03 0.22 -0.42 0.47 1.00 3079 4285
## visual_strength -0.65 0.35 -1.35 0.03 1.00 2700 3874
## haptic_strength -0.01 0.25 -0.52 0.48 1.00 3209 4163
## gustatory_strength -0.21 0.43 -1.04 0.66 1.00 3092 4328
## olfactory_strength 0.45 0.66 -0.85 1.76 1.00 3306 4590
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check what proportion of the posterior draws are of the hypothesized sign for each effect:
hypothesis(other_all_senses_mdl, 'auditory_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (auditory_strength) > 0 0.03 0.22 -0.34 0.39 1.24
## Post.Prob Star
## 1 0.55
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_senses_mdl, 'visual_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (visual_strength) < 0 -0.65 0.35 -1.23 -0.1 33.48
## Post.Prob Star
## 1 0.97 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_senses_mdl, 'haptic_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (haptic_strength) < 0 -0.01 0.25 -0.43 0.4 1.05
## Post.Prob Star
## 1 0.51
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_senses_mdl, 'gustatory_strength < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (gustatory_strength) < 0 -0.21 0.43 -0.91 0.51 2.26
## Post.Prob Star
## 1 0.69
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_senses_mdl, 'olfactory_strength > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio
## 1 (olfactory_strength) > 0 0.45 0.66 -0.63 1.54 3.08
## Post.Prob Star
## 1 0.76
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
Model with all five effector-specific action strength ratings as separate predictors:
gesture_all_motor_mdl <- brm(gesture ~
# Fixed effects:
1 + hand_arm_mean +
head_mean +
foot_leg_mean +
mouth_mean +
torso_mean +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(gesture_all_motor_mdl, file = '../models/gesture_all_motor_mdl.RData')
Load and show model:
# Load:
load('../models/gesture_all_motor_mdl.RData')
# Show:
gesture_all_motor_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: gesture ~ 1 + hand_arm_mean + head_mean + foot_leg_mean + mouth_mean + torso_mean + (1 | word) + (1 | url)
## Data: df (Number of observations: 1546)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.31 0.27 1.82 2.86 1.00 1661 3496
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.13 0.25 0.71 1.67 1.00 1947 3368
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.89 1.18 -0.42 4.24 1.00 2397 3472
## hand_arm_mean 0.34 0.30 -0.24 0.97 1.00 2478 3786
## head_mean -0.26 0.34 -0.94 0.42 1.00 2519 3709
## foot_leg_mean 0.54 0.44 -0.34 1.39 1.00 2657 4013
## mouth_mean -0.35 0.21 -0.76 0.06 1.00 2536 3811
## torso_mean -0.86 0.46 -1.77 0.04 1.00 2900 3744
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
This calculates the actual posterior probability that each effect has the hypothesized sign, which is essentially just pinning a number on the proportion of the posterior distribution that falls on that side of zero.
hypothesis(gesture_all_motor_mdl, 'hand_arm_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (hand_arm_mean) > 0 0.34 0.3 -0.15 0.85 6.84 0.87
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_motor_mdl, 'head_mean < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (head_mean) < 0 -0.26 0.34 -0.83 0.3 3.79 0.79
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_motor_mdl, 'foot_leg_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (foot_leg_mean) > 0 0.54 0.44 -0.18 1.26 8.21 0.89
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_motor_mdl, 'torso_mean < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (torso_mean) < 0 -0.86 0.46 -1.61 -0.12 31.39 0.97
## Star
## 1 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(gesture_all_motor_mdl, 'mouth_mean < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (mouth_mean) < 0 -0.35 0.21 -0.69 -0.02 22.88 0.96
## Star
## 1 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
Same for iconic gestures:
iconic_all_motor_mdl <- brm(iconic_gesture ~
# Fixed effects:
1 + hand_arm_mean +
head_mean +
foot_leg_mean +
mouth_mean +
torso_mean +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = df,
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(iconic_all_motor_mdl, file = '../models/iconic_all_motor_mdl.RData')
Load and show model:
# Load model:
load('../models/iconic_all_motor_mdl.RData')
# Show model:
iconic_all_motor_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: iconic_gesture ~ 1 + hand_arm_mean + head_mean + foot_leg_mean + mouth_mean + torso_mean + (1 | word) + (1 | url)
## Data: df (Number of observations: 1545)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 837)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.30 0.35 1.68 3.04 1.00 1600 3345
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 0.93 0.36 0.22 1.69 1.00 1011 902
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -1.92 1.20 -4.34 0.51 1.00 3817 4488
## hand_arm_mean 0.57 0.31 -0.06 1.18 1.00 3198 4340
## head_mean -0.55 0.36 -1.31 0.13 1.00 3965 4720
## foot_leg_mean 1.41 0.43 0.57 2.26 1.00 3629 4409
## mouth_mean -0.32 0.21 -0.74 0.10 1.00 3815 4733
## torso_mean -2.15 0.53 -3.23 -1.13 1.00 3179 4131
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
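Since the model uses a logit link, the coefficients are on the log-odds scale. As a minimal sketch (not part of the original script), they can be exponentiated into odds ratios for easier interpretation; this assumes `iconic_all_motor_mdl` is loaded as above:

```r
# Sketch: convert the logit-scale coefficients of the iconic gesture
# model into odds ratios with 95% credible intervals.
fixef(iconic_all_motor_mdl) |>
  as.data.frame() |>
  tibble::rownames_to_column('coefficient') |>
  dplyr::mutate(odds_ratio = exp(Estimate),
                OR_lower = exp(Q2.5),
                OR_upper = exp(Q97.5))
```

For example, the `foot_leg_mean` estimate of 1.41 corresponds to an odds ratio of about exp(1.41) ≈ 4.1 per unit increase in foot/leg motor strength.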
Check which coefficients have the predicted sign, using one-sided hypothesis tests:
hypothesis(iconic_all_motor_mdl, 'hand_arm_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (hand_arm_mean) > 0 0.57 0.31 0.05 1.06 25.85 0.96
## Star
## 1 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_motor_mdl, 'head_mean < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (head_mean) < 0 -0.55 0.36 -1.18 0.03 16.2 0.94
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_motor_mdl, 'foot_leg_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (foot_leg_mean) > 0 1.41 0.43 0.71 2.1 799 1
## Star
## 1 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_motor_mdl, 'mouth_mean < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (mouth_mean) < 0 -0.32 0.21 -0.66 0.02 16.09 0.94
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(iconic_all_motor_mdl, 'torso_mean < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (torso_mean) < 0 -2.15 0.53 -3.03 -1.29 Inf 1
## Star
## 1 *
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
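Rather than calling `hypothesis()` once per predictor, the same one-sided tests can be run in a loop and collected into a single table. This is a sketch, not part of the original script; the predicted signs follow the individual tests above:

```r
# Sketch: run all one-sided hypothesis tests for the iconic model
# and bind the results into one data frame.
predictions <- c('hand_arm_mean > 0', 'head_mean < 0',
                 'foot_leg_mean > 0', 'mouth_mean < 0',
                 'torso_mean < 0')
purrr::map_dfr(predictions, function(h) {
  # hypothesis() returns an object whose $hypothesis slot is a
  # one-row data frame incl. Evid.Ratio and Post.Prob
  hypothesis(iconic_all_motor_mdl, h)$hypothesis
})
```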
The final model is the same, but for the non-iconic (= “other”) gestures:
other_all_motor_mdl <- brm(non_iconic_gesture ~
# Fixed effects:
1 + hand_arm_mean +
head_mean +
foot_leg_mean +
mouth_mean +
torso_mean +
# Random effects:
(1|word) + (1|url),
# More model specs:
prior = weak_priors,
family = bernoulli,
data = filter(df, !is.na(non_iconic_gesture)),
# MCMC estimation specs:
seed = 42, cores = 4,
iter = 4000, warmup = 2000)
# Save:
save(other_all_motor_mdl, file = '../models/other_all_motor_mdl.RData')
Load and show model:
# Load model:
load('../models/other_all_motor_mdl.RData')
# Show model:
other_all_motor_mdl
## Family: bernoulli
## Links: mu = logit
## Formula: non_iconic_gesture ~ 1 + hand_arm_mean + head_mean + foot_leg_mean + mouth_mean + torso_mean + (1 | word) + (1 | url)
## Data: filter(df, !is.na(non_iconic_gesture)) (Number of observations: 1335)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Multilevel Hyperparameters:
## ~url (Number of levels: 736)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 2.42 0.30 1.89 3.04 1.00 1490 3121
##
## ~word (Number of levels: 45)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sd(Intercept) 1.20 0.27 0.74 1.78 1.00 2013 3150
##
## Regression Coefficients:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept -1.07 1.24 -3.54 1.36 1.00 2261 3505
## hand_arm_mean -0.22 0.32 -0.87 0.41 1.00 2582 3706
## head_mean 0.09 0.36 -0.61 0.80 1.00 2417 3939
## foot_leg_mean 0.07 0.49 -0.89 1.04 1.00 2761 4059
## mouth_mean 0.32 0.22 -0.12 0.76 1.00 2108 2863
## torso_mean 0.18 0.50 -0.79 1.19 1.00 2870 3893
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
Check which coefficients have the predicted sign, using one-sided hypothesis tests:
hypothesis(other_all_motor_mdl, 'hand_arm_mean < 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (hand_arm_mean) < 0 -0.22 0.32 -0.75 0.3 3.06 0.75
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_motor_mdl, 'head_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (head_mean) > 0 0.09 0.36 -0.5 0.68 1.53 0.6
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_motor_mdl, 'foot_leg_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (foot_leg_mean) > 0 0.07 0.49 -0.73 0.89 1.24 0.55
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_motor_mdl, 'mouth_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (mouth_mean) > 0 0.32 0.22 -0.04 0.69 12.49 0.93
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
hypothesis(other_all_motor_mdl, 'torso_mean > 0')
## Hypothesis Tests for class b:
## Hypothesis Estimate Est.Error CI.Lower CI.Upper Evid.Ratio Post.Prob
## 1 (torso_mean) > 0 0.18 0.5 -0.63 1.02 1.81 0.64
## Star
## 1
## ---
## 'CI': 90%-CI for one-sided and 95%-CI for two-sided hypotheses.
## '*': For one-sided hypotheses, the posterior probability exceeds 95%;
## for two-sided hypotheses, the value tested against lies outside the 95%-CI.
## Posterior probabilities of point hypotheses assume equal prior probabilities.
First, let’s compare the iconicity model for all gestures against the all-sensory-strength and all-motor-strength models in terms of Bayesian R²:
bayes_R2(gesture_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4848634 0.02663119 0.4309332 0.5354387
bayes_R2(gesture_all_senses_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4915309 0.02622017 0.4374499 0.5409381
bayes_R2(gesture_all_motor_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4914347 0.02585352 0.4388202 0.5403105
Next, same thing for iconic gestures only:
bayes_R2(iconic_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4547021 0.03812971 0.3773427 0.5251493
bayes_R2(iconic_all_senses_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4665948 0.03773457 0.3899429 0.5385222
bayes_R2(iconic_all_motor_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4587925 0.03801456 0.3837259 0.5309054
Then for all other gestures:
bayes_R2(other_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4875807 0.02941444 0.4278217 0.5429119
bayes_R2(other_all_senses_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4951544 0.02871305 0.4369144 0.5490764
bayes_R2(other_all_motor_mdl)
## Estimate Est.Error Q2.5 Q97.5
## R2 0.4973807 0.02874939 0.4390567 0.5502481
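For a side-by-side comparison, the nine R² estimates can be gathered into one table. A minimal sketch, assuming all nine models are already loaded into the session:

```r
# Sketch: collect the Bayesian R2 of each model into a single tibble,
# labeled by model name.
models <- list(gesture       = gesture_mdl,
               gesture_senses = gesture_all_senses_mdl,
               gesture_motor  = gesture_all_motor_mdl,
               iconic        = iconic_mdl,
               iconic_senses = iconic_all_senses_mdl,
               iconic_motor  = iconic_all_motor_mdl,
               other         = other_mdl,
               other_senses  = other_all_senses_mdl,
               other_motor   = other_all_motor_mdl)
purrr::map_dfr(models, ~ as.data.frame(bayes_R2(.x)), .id = 'model')
```

All nine models land in a narrow band (R² ≈ 0.45–0.50), so none of the predictor sets clearly dominates on this metric.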
This completes the analysis.